public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/10] raid1 balancing methods
@ 2024-11-15 14:54 Anand Jain
  2024-11-15 14:54 ` [PATCH v3 01/10] btrfs: initialize fs_devices->fs_info earlier Anand Jain
                   ` (11 more replies)
  0 siblings, 12 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

v3:
1. Removed the latency-based RAID1 balancing patch. (Per David's review)
2. Renamed "rotation" to "round-robin" and set the per-set
   min_contiguous_read to 256k. (Per David's review)
3. Added raid1-balancing module configuration for fstests testing.
   raid1-balancing can now be configured through both module load
   parameters and sysfs.

The logic of individual methods remains unchanged, and performance metrics
are consistent with v2.

----- 
v2:
1. Move new features to CONFIG_BTRFS_EXPERIMENTAL instead of CONFIG_BTRFS_DEBUG.
2. Correct the typo from %est_wait to %best_wait.
3. Initialize %best_wait to U64_MAX and remove the check for 0.
4. Implement rotation with a minimum contiguous read threshold before
   switching to the next stripe. Configure this, using:

        echo rotation:[min_contiguous_read] > /sys/fs/btrfs/<uuid>/read_policy

   The default value is the sector size, and the min_contiguous_read
   value must be a multiple of the sector size.

5. Tested FIO random read/write and defrag compression workloads with
   min_contiguous_read set to sector size, 192k, and 256k.

   RAID1 balancing method rotation is better for multi-process workloads
   such as fio and also single-process workload such as defragmentation.

     $ fio --filename=/btrfs/foo --size=5Gi --direct=1 --rw=randrw --bs=4k \
        --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 \
        --time_based --group_reporting --name=iops-test-job --eta-newline=1


|         |            |            | Read I/O count  |
|         | Read       | Write      | devid1 | devid2 |
|---------|------------|------------|--------|--------|
| pid     | 20.3MiB/s  | 20.5MiB/s  | 313895 | 313895 |
| rotation|            |            |        |        |
|     4096| 20.4MiB/s  | 20.5MiB/s  | 313895 | 313895 |
|   196608| 20.2MiB/s  | 20.2MiB/s  | 310152 | 310175 |
|   262144| 20.3MiB/s  | 20.4MiB/s  | 312180 | 312191 |
|  latency| 18.4MiB/s  | 18.4MiB/s  | 272980 | 291683 |
| devid:1 | 14.8MiB/s  | 14.9MiB/s  | 456376 | 0      |

   rotation RAID1 balancing technique performs more than 2x better for
   single-process defrag.

      $ time -p btrfs filesystem defrag -r -f -c /btrfs


|         | Time  | Read I/O Count  |
|         | Real  | devid1 | devid2 |
|---------|-------|--------|--------|
| pid     | 18.00s| 3800   | 0      |
| rotation|       |        |        |
|     4096|  8.95s| 1900   | 1901   |
|   196608|  8.50s| 1881   | 1919   |
|   262144|  8.80s| 1881   | 1919   |
| latency | 17.18s| 3800   | 0      |
| devid:2 | 17.48s| 0      | 3800   |

Rotation keeps all devices active, and for now, the Rotation RAID1
balancing method is preferable as default. More workload testing is
needed while the code is EXPERIMENTAL.
While Latency is better during the failing/unstable block layer transport.
As of now these two techniques, are needed to be further independently
tested with different worloads, and in the long term we should be merge
these technique to a unified heuristic.

Rotation keeps all devices active, and for now, the Rotation RAID1
balancing method should be the default. More workload testing is needed
while the code is EXPERIMENTAL.

Latency is smarter with unstable block layer transport.

Both techniques need independent testing across workloads, with the goal of
eventually merging them into a unified approach? for the long term.

Devid is a hands-on approach, provides manual or user-space script control.

These RAID1 balancing methods are tunable via the sysfs knob.
The mount -o option and btrfs properties are under consideration.

Thx.

--------- original v1 ------------

The RAID1-balancing methods helps distribute read I/O across devices, and
this patch introduces three balancing methods: rotation, latency, and
devid. These methods are enabled under the `CONFIG_BTRFS_DEBUG` config
option and are on top of the previously added
`/sys/fs/btrfs/<UUID>/read_policy` interface to configure the desired
RAID1 read balancing method.

I've tested these patches using fio and filesystem defragmentation
workloads on a two-device RAID1 setup (with both data and metadata
mirrored across identical devices). I tracked device read counts by
extracting stats from `/sys/devices/<..>/stat` for each device. Below is
a summary of the results, with each result the average of three
iterations.

A typical generic random rw workload:

$ fio --filename=/btrfs/foo --size=10Gi --direct=1 --rw=randrw --bs=4k \
  --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 --time_based \
  --group_reporting --name=iops-test-job --eta-newline=1

|         |            |            | Read I/O count  |
|         | Read       | Write      | devid1 | devid2 |
|---------|------------|------------|--------|--------|
| pid     | 29.4MiB/s  | 29.5MiB/s  | 456548 | 447975 |
| rotation| 29.3MiB/s  | 29.3MiB/s  | 450105 | 450055 |
| latency | 21.9MiB/s  | 21.9MiB/s  | 672387 | 0      |
| devid:1 | 22.0MiB/s  | 22.0MiB/s  | 674788 | 0      |

Defragmentation with compression workload:

$ xfs_io -f -d -c 'pwrite -S 0xab 0 1G' /btrfs/foo
$ sync
$ echo 3 > /proc/sys/vm/drop_caches
$ btrfs filesystem defrag -f -c /btrfs/foo

|         | Time  | Read I/O Count  |
|         | Real  | devid1 | devid2 |
|---------|-------|--------|--------|
| pid     | 21.61s| 3810   | 0      |
| rotation| 11.55s| 1905   | 1905   |
| latency | 20.99s| 0      | 3810   |
| devid:2 | 21.41s| 0      | 3810   |

. The PID-based balancing method works well for the generic random rw fio
  workload.
. The rotation method is ideal when you want to keep both devices active,
  and it boosts performance in sequential defragmentation scenarios.
. The latency-based method work well when we have mixed device types or
  when one device experiences intermittent I/O failures the latency
  increases and it automatically picks the other device for further Read
  IOs.
. The devid method is a more hands-on approach, useful for diagnosing and
  testing RAID1 mirror synchronizations.

Anand Jain (10):
  btrfs: initialize fs_devices->fs_info earlier
  btrfs: simplify output formatting in btrfs_read_policy_show
  btrfs: add btrfs_read_policy_to_enum helper and refactor read policy
    store
  btrfs: handle value associated with raid1 balancing parameter
  btrfs: introduce RAID1 round-robin read balancing
  btrfs: add RAID1 preferred read device
  btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status
  btrfs: fix CONFIG_BTRFS_EXPERIMENTAL migration
  btrfs: enable RAID1 balancing configuration via modprobe parameter
  btrfs: modload to print RAID1 balancing status

 fs/btrfs/disk-io.c |   1 +
 fs/btrfs/super.c   |  22 +++++-
 fs/btrfs/sysfs.c   | 181 ++++++++++++++++++++++++++++++++++++++++-----
 fs/btrfs/sysfs.h   |   5 ++
 fs/btrfs/volumes.c |  86 ++++++++++++++++++++-
 fs/btrfs/volumes.h |  14 ++++
 6 files changed, 286 insertions(+), 23 deletions(-)

-- 
2.46.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 01/10] btrfs: initialize fs_devices->fs_info earlier
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-11-15 14:54 ` [PATCH v3 02/10] btrfs: simplify output formatting in btrfs_read_policy_show Anand Jain
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

Currently, fs_devices->fs_info is initialized in btrfs_init_devices_late(),
but this occurs too late for find_live_mirror(), which is invoked by
load_super_root() much earlier than btrfs_init_devices_late().

Fix this by moving the initialization to open_ctree(), before load_super_root().

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/disk-io.c | 1 +
 fs/btrfs/volumes.c | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 814320948645..ab45b02df957 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3321,6 +3321,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 	fs_info->sectors_per_page = (PAGE_SIZE >> fs_info->sectorsize_bits);
 	fs_info->csums_per_leaf = BTRFS_MAX_ITEM_SIZE(fs_info) / fs_info->csum_size;
 	fs_info->stripesize = stripesize;
+	fs_info->fs_devices->fs_info = fs_info;
 
 	/*
 	 * Handle the space caching options appropriately now that we have the
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1cccaf9c2b0d..fe5ceea2ba0b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7568,8 +7568,6 @@ int btrfs_init_devices_late(struct btrfs_fs_info *fs_info)
 	struct btrfs_device *device;
 	int ret = 0;
 
-	fs_devices->fs_info = fs_info;
-
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry(device, &fs_devices->devices, dev_list)
 		device->fs_info = fs_info;
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 02/10] btrfs: simplify output formatting in btrfs_read_policy_show
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
  2024-11-15 14:54 ` [PATCH v3 01/10] btrfs: initialize fs_devices->fs_info earlier Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-11-15 14:54 ` [PATCH v3 03/10] btrfs: add btrfs_read_policy_to_enum helper and refactor read policy store Anand Jain
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

Refactor the logic in btrfs_read_policy_show() to streamline the
formatting of read policies output. Streamline the space and bracket
handling around the active policy without altering the functional output.
This is in preparation to add more methods.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index b843308e2bc6..fd3c49c6c3c5 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1316,14 +1316,16 @@ static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 	int i;
 
 	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
-		if (policy == i)
-			ret += sysfs_emit_at(buf, ret, "%s[%s]",
-					 (ret == 0 ? "" : " "),
-					 btrfs_read_policy_name[i]);
-		else
-			ret += sysfs_emit_at(buf, ret, "%s%s",
-					 (ret == 0 ? "" : " "),
-					 btrfs_read_policy_name[i]);
+		if (ret != 0)
+			ret += sysfs_emit_at(buf, ret, " ");
+
+		if (i == policy)
+			ret += sysfs_emit_at(buf, ret, "[");
+
+		ret += sysfs_emit_at(buf, ret, "%s", btrfs_read_policy_name[i]);
+
+		if (i == policy)
+			ret += sysfs_emit_at(buf, ret, "]");
 	}
 
 	ret += sysfs_emit_at(buf, ret, "\n");
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 03/10] btrfs: add btrfs_read_policy_to_enum helper and refactor read policy store
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
  2024-11-15 14:54 ` [PATCH v3 01/10] btrfs: initialize fs_devices->fs_info earlier Anand Jain
  2024-11-15 14:54 ` [PATCH v3 02/10] btrfs: simplify output formatting in btrfs_read_policy_show Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-12-06 16:47   ` David Sterba
  2024-11-15 14:54 ` [PATCH v3 04/10] btrfs: handle value associated with raid1 balancing parameter Anand Jain
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

Introduce the `btrfs_read_policy_to_enum` helper function to simplify the
conversion of a string read policy to its corresponding enum value. This
reduces duplication and improves code clarity in `btrfs_read_policy_store`.
The `btrfs_read_policy_store` function has been refactored to use the new
helper.

The parameter is copied locally to allow modification, enabling the
separation of the method and its value. This prepares for the addition of
more functionality in subsequent patches.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c | 50 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index fd3c49c6c3c5..7506818ec45f 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1307,6 +1307,34 @@ BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
 
 static const char * const btrfs_read_policy_name[] = { "pid" };
 
+static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str)
+{
+	bool found = false;
+	enum btrfs_read_policy index;
+	char *param;
+
+	if (!str || !strlen(str))
+		return 0;
+
+	param = kstrdup(str, GFP_KERNEL);
+	if (!param)
+		return -ENOMEM;
+
+	for (index = 0; index < BTRFS_NR_READ_POLICY; index++) {
+		if (sysfs_streq(param, btrfs_read_policy_name[index])) {
+			found = true;
+			break;
+		}
+	}
+
+	kfree(param);
+
+	if (found)
+		return index;
+
+	return -EINVAL;
+}
+
 static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 				      struct kobj_attribute *a, char *buf)
 {
@@ -1338,21 +1366,19 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
 				       const char *buf, size_t len)
 {
 	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
-	int i;
+	enum btrfs_read_policy index;
 
-	for (i = 0; i < BTRFS_NR_READ_POLICY; i++) {
-		if (sysfs_streq(buf, btrfs_read_policy_name[i])) {
-			if (i != READ_ONCE(fs_devices->read_policy)) {
-				WRITE_ONCE(fs_devices->read_policy, i);
-				btrfs_info(fs_devices->fs_info,
-					   "read policy set to '%s'",
-					   btrfs_read_policy_name[i]);
-			}
-			return len;
-		}
+	index = btrfs_read_policy_to_enum(buf);
+	if (index == -EINVAL)
+		return -EINVAL;
+
+	if (index != READ_ONCE(fs_devices->read_policy)) {
+		WRITE_ONCE(fs_devices->read_policy, index);
+		btrfs_info(fs_devices->fs_info, "read policy set to '%s'",
+			   btrfs_read_policy_name[index]);
 	}
 
-	return -EINVAL;
+	return len;
 }
 BTRFS_ATTR_RW(, read_policy, btrfs_read_policy_show, btrfs_read_policy_store);
 
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 04/10] btrfs: handle value associated with raid1 balancing parameter
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (2 preceding siblings ...)
  2024-11-15 14:54 ` [PATCH v3 03/10] btrfs: add btrfs_read_policy_to_enum helper and refactor read policy store Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-12-06 16:53   ` David Sterba
  2024-11-15 14:54 ` [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing Anand Jain
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

This change enables specifying additional configuration values alongside
the raid1 balancing / read policy in a single input string.

Updated btrfs_read_policy_to_enum() to parse and handle a value associated
with the policy in the format `policy:value`, the value part if present is
converted 64-bit integer. Update btrfs_read_policy_store() to accommodate
the new parameter.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 7506818ec45f..7907507b8ced 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1307,8 +1307,11 @@ BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
 
 static const char * const btrfs_read_policy_name[] = { "pid" };
 
-static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str)
+static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value)
 {
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+	char *value_str;
+#endif
 	bool found = false;
 	enum btrfs_read_policy index;
 	char *param;
@@ -1320,6 +1323,18 @@ static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str)
 	if (!param)
 		return -ENOMEM;
 
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+	/* Separate value from input in policy:value format. */
+	if ((value_str = strchr(param, ':'))) {
+		*value_str = '\0';
+		value_str++;
+		if (value && kstrtou64(value_str, 10, value) != 0) {
+			kfree(param);
+			return -EINVAL;
+		}
+	}
+#endif
+
 	for (index = 0; index < BTRFS_NR_READ_POLICY; index++) {
 		if (sysfs_streq(param, btrfs_read_policy_name[index])) {
 			found = true;
@@ -1367,8 +1382,9 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
 {
 	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
 	enum btrfs_read_policy index;
+	s64 value = -1;
 
-	index = btrfs_read_policy_to_enum(buf);
+	index = btrfs_read_policy_to_enum(buf, &value);
 	if (index == -EINVAL)
 		return -EINVAL;
 
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (3 preceding siblings ...)
  2024-11-15 14:54 ` [PATCH v3 04/10] btrfs: handle value associated with raid1 balancing parameter Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-11-18  7:34   ` kernel test robot
  2024-12-06 17:13   ` David Sterba
  2024-11-15 14:54 ` [PATCH v3 06/10] btrfs: add RAID1 preferred read device Anand Jain
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

This feature balances I/O across the striped devices when reading from
RAID1 blocks.

   echo round-robin:[min_contiguous_read] > /sys/fs/btrfs/<uuid>/read_policy

The min_contiguous_read parameter defines the minimum read size before
switching to the next mirrored device. This setting is optional, with a
default value of 256 KiB.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c   | 38 +++++++++++++++++++++++++++++
 fs/btrfs/volumes.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.h |  9 +++++++
 3 files changed, 107 insertions(+)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 7907507b8ced..092a78298d1a 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1305,7 +1305,11 @@ static ssize_t btrfs_temp_fsid_show(struct kobject *kobj,
 }
 BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
 
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+static const char * const btrfs_read_policy_name[] = { "pid", "round-robin" };
+#else
 static const char * const btrfs_read_policy_name[] = { "pid" };
+#endif
 
 static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value)
 {
@@ -1367,6 +1371,12 @@ static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 
 		ret += sysfs_emit_at(buf, ret, "%s", btrfs_read_policy_name[i]);
 
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+		if (i == BTRFS_READ_POLICY_RR)
+			ret += sysfs_emit_at(buf, ret, ":%d",
+					     fs_devices->min_contiguous_read);
+#endif
+
 		if (i == policy)
 			ret += sysfs_emit_at(buf, ret, "]");
 	}
@@ -1388,6 +1398,34 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
 	if (index == -EINVAL)
 		return -EINVAL;
 
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+	if (index == BTRFS_READ_POLICY_RR) {
+		if (value != -1) {
+			if ((value % fs_devices->fs_info->sectorsize) != 0) {
+				btrfs_err(fs_devices->fs_info,
+"read_policy: min_contiguous_read %lld should be multiples of the sectorsize %u",
+					  value, fs_devices->fs_info->sectorsize);
+				return -EINVAL;
+			}
+		} else {
+			/* value is not provided, set it to the default 256k */
+			value = 256 * 1024;
+		}
+
+		if (index != READ_ONCE(fs_devices->read_policy) ||
+		    value != READ_ONCE(fs_devices->min_contiguous_read)) {
+			WRITE_ONCE(fs_devices->read_policy, index);
+			WRITE_ONCE(fs_devices->min_contiguous_read, value);
+			atomic_set(&fs_devices->total_reads, 0);
+
+			btrfs_info(fs_devices->fs_info, "read policy set to '%s:%lld'",
+				   btrfs_read_policy_name[index], value);
+
+		}
+
+		return len;
+	}
+#endif
 	if (index != READ_ONCE(fs_devices->read_policy)) {
 		WRITE_ONCE(fs_devices->read_policy, index);
 		btrfs_info(fs_devices->fs_info, "read policy set to '%s'",
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fe5ceea2ba0b..97576a715191 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1328,6 +1328,10 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	fs_devices->total_rw_bytes = 0;
 	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
 	fs_devices->read_policy = BTRFS_READ_POLICY_PID;
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+	/* Set min_contiguous_read to a default 256kib */
+	fs_devices->min_contiguous_read = 256 * 1024;
+#endif
 
 	return 0;
 }
@@ -5959,6 +5963,57 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info,
 	return len;
 }
 
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+struct stripe_mirror {
+	u64 devid;
+	int num;
+};
+
+static int btrfs_cmp_devid(const void *a, const void *b)
+{
+	struct stripe_mirror *s1 = (struct stripe_mirror *)a;
+	struct stripe_mirror *s2 = (struct stripe_mirror *)b;
+
+	if (s1->devid < s2->devid)
+		return -1;
+	if (s1->devid > s2->devid)
+		return 1;
+	return 0;
+}
+
+static int btrfs_read_rr(struct btrfs_chunk_map *map, int first, int num_stripe)
+{
+	struct stripe_mirror stripes[4] = {0}; //4: max possible mirrors
+	struct btrfs_fs_devices *fs_devices;
+	struct btrfs_device *device;
+	int j;
+	int read_cycle;
+	int index;
+	int ret_stripe;
+	int total_reads;
+	int reads_per_dev = 0;
+
+	device = map->stripes[first].dev;
+
+	fs_devices = device->fs_devices;
+	reads_per_dev = fs_devices->min_contiguous_read/fs_devices->fs_info->sectorsize;
+	index = 0;
+	for (j = first; j < first + num_stripe; j++) {
+		stripes[index].devid = map->stripes[j].dev->devid;
+		stripes[index].num = j;
+		index++;
+	}
+	sort(stripes, num_stripe, sizeof(struct stripe_mirror),
+	     btrfs_cmp_devid, NULL);
+
+	total_reads = atomic_inc_return(&fs_devices->total_reads);
+	read_cycle = total_reads/reads_per_dev;
+	ret_stripe = stripes[read_cycle % num_stripe].num;
+
+	return ret_stripe;
+}
+#endif
+
 static int find_live_mirror(struct btrfs_fs_info *fs_info,
 			    struct btrfs_chunk_map *map, int first,
 			    int dev_replace_is_ongoing)
@@ -5988,6 +6043,11 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 	case BTRFS_READ_POLICY_PID:
 		preferred_mirror = first + (current->pid % num_stripes);
 		break;
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+	case BTRFS_READ_POLICY_RR:
+		preferred_mirror = btrfs_read_rr(map, first, num_stripes);
+		break;
+#endif
 	}
 
 	if (dev_replace_is_ongoing &&
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 3a416b1bc24c..05778361c270 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -303,6 +303,10 @@ enum btrfs_chunk_allocation_policy {
 enum btrfs_read_policy {
 	/* Use process PID to choose the stripe */
 	BTRFS_READ_POLICY_PID,
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+	/* Balancing raid1 reads across all striped devices (round-robin) */
+	BTRFS_READ_POLICY_RR,
+#endif
 	BTRFS_NR_READ_POLICY,
 };
 
@@ -431,6 +435,11 @@ struct btrfs_fs_devices {
 	enum btrfs_read_policy read_policy;
 
 #ifdef CONFIG_BTRFS_EXPERIMENTAL
+	/* IO stat, read counter. */
+	atomic_t total_reads;
+	/* Min contiguous reads before switching to next device. */
+	int min_contiguous_read;
+
 	/* Checksum mode - offload it or do it synchronously. */
 	enum btrfs_offload_csum_mode offload_csum_mode;
 #endif
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 06/10] btrfs: add RAID1 preferred read device
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (4 preceding siblings ...)
  2024-11-15 14:54 ` [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-11-15 14:54 ` [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status Anand Jain
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

When there's stale data on a mirrored device, this feature lets you choose
which device to read from. Mainly used for testing.

echo "devid:<devid-value>" > /sys/fs/btrfs/<UUID>/read_policy

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c   | 34 ++++++++++++++++++++++++++++++++--
 fs/btrfs/volumes.c | 21 +++++++++++++++++++++
 fs/btrfs/volumes.h |  5 +++++
 3 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 092a78298d1a..96d0480d1b9e 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1306,7 +1306,7 @@ static ssize_t btrfs_temp_fsid_show(struct kobject *kobj,
 BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
 
 #ifdef CONFIG_BTRFS_EXPERIMENTAL
-static const char * const btrfs_read_policy_name[] = { "pid", "round-robin" };
+static const char * const btrfs_read_policy_name[] = { "pid", "round-robin", "devid" };
 #else
 static const char * const btrfs_read_policy_name[] = { "pid" };
 #endif
@@ -1375,8 +1375,11 @@ static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 		if (i == BTRFS_READ_POLICY_RR)
 			ret += sysfs_emit_at(buf, ret, ":%d",
 					     fs_devices->min_contiguous_read);
-#endif
 
+		if (i == BTRFS_READ_POLICY_DEVID)
+			ret += sysfs_emit_at(buf, ret, ":%llu",
+							fs_devices->read_devid);
+#endif
 		if (i == policy)
 			ret += sysfs_emit_at(buf, ret, "]");
 	}
@@ -1425,6 +1428,33 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
 
 		return len;
 	}
+
+	if (index == BTRFS_READ_POLICY_DEVID) {
+
+		if (value != -1) {
+			BTRFS_DEV_LOOKUP_ARGS(args);
+
+			/* Validate input devid */
+			args.devid = value;
+			if (btrfs_find_device(fs_devices, &args) == NULL)
+				return -EINVAL;
+		} else {
+			/* Set default devid to the devid of the latest device */
+			value = fs_devices->latest_dev->devid;
+		}
+
+		if (index != READ_ONCE(fs_devices->read_policy) ||
+		    (value != READ_ONCE(fs_devices->read_devid))) {
+			WRITE_ONCE(fs_devices->read_policy, index);
+			WRITE_ONCE(fs_devices->read_devid, value);
+
+			btrfs_info(fs_devices->fs_info, "read policy set to '%s:%llu'",
+				   btrfs_read_policy_name[index], value);
+
+		}
+
+		return len;
+	}
 #endif
 	if (index != READ_ONCE(fs_devices->read_policy)) {
 		WRITE_ONCE(fs_devices->read_policy, index);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 97576a715191..90d84fb664aa 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1331,6 +1331,7 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 #ifdef CONFIG_BTRFS_EXPERIMENTAL
 	/* Set min_contiguous_read to a default 256kib */
 	fs_devices->min_contiguous_read = 256 * 1024;
+	fs_devices->read_devid = latest_dev->devid;
 #endif
 
 	return 0;
@@ -5964,6 +5965,23 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info,
 }
 
 #ifdef CONFIG_BTRFS_EXPERIMENTAL
+static int btrfs_read_preferred(struct btrfs_chunk_map *map, int first,
+				int num_stripe)
+{
+	int last = first + num_stripe;
+	int stripe_index;
+
+	for (stripe_index = first; stripe_index < last; stripe_index++) {
+		struct btrfs_device *device = map->stripes[stripe_index].dev;
+
+		if (device->devid == READ_ONCE(device->fs_devices->read_devid))
+			return stripe_index;
+	}
+
+	/* If no read-preferred device, use first stripe */
+	return first;
+}
+
 struct stripe_mirror {
 	u64 devid;
 	int num;
@@ -6047,6 +6065,9 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
 	case BTRFS_READ_POLICY_RR:
 		preferred_mirror = btrfs_read_rr(map, first, num_stripes);
 		break;
+	case BTRFS_READ_POLICY_DEVID:
+		preferred_mirror = btrfs_read_preferred(map, first, num_stripes);
+		break;
 #endif
 	}
 
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 05778361c270..3b7ba202b169 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -306,6 +306,8 @@ enum btrfs_read_policy {
 #ifdef CONFIG_BTRFS_EXPERIMENTAL
 	/* Balancing raid1 reads across all striped devices (round-robin) */
 	BTRFS_READ_POLICY_RR,
+	/* Read from the specific device */
+	BTRFS_READ_POLICY_DEVID,
 #endif
 	BTRFS_NR_READ_POLICY,
 };
@@ -440,6 +442,9 @@ struct btrfs_fs_devices {
 	/* Min contiguous reads before switching to next device. */
 	int min_contiguous_read;
 
+	/* Device to be used for reading in case of RAID1. */
+	u64 read_devid;
+
 	/* Checksum mode - offload it or do it synchronously. */
 	enum btrfs_offload_csum_mode offload_csum_mode;
 #endif
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (5 preceding siblings ...)
  2024-11-15 14:54 ` [PATCH v3 06/10] btrfs: add RAID1 preferred read device Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-12-06 17:16   ` David Sterba
  2024-12-06 17:18   ` David Sterba
  2024-11-15 14:54 ` [PATCH v3 08/10] btrfs: fix CONFIG_BTRFS_EXPERIMENTAL migration Anand Jain
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

Commit c9c49e8f157e ("btrfs: split out CONFIG_BTRFS_EXPERIMENTAL from
CONFIG_BTRFS_DEBUG") introduces a way to enable or disable experimental
features, print its status during module load, like so:

Btrfs loaded, debug=on, assert=on, zoned=yes, fsverity=yes

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/super.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 6cc9291c4552..d52f7f6e2de7 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2487,6 +2487,11 @@ static int __init btrfs_print_mod_info(void)
 			", fsverity=yes"
 #else
 			", fsverity=no"
+#endif
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+			", experimental=yes"
+#else
+			", experimental=no"
 #endif
 			;
 	pr_info("Btrfs loaded%s\n", options);
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 08/10] btrfs: fix CONFIG_BTRFS_EXPERIMENTAL migration
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (6 preceding siblings ...)
  2024-11-15 14:54 ` [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-11-15 14:54 ` [PATCH v3 09/10] btrfs: enable RAID1 balancing configuration via modprobe parameter Anand Jain
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

Commit c9c49e8f157e ("btrfs: split out CONFIG_BTRFS_EXPERIMENTAL from
CONFIG_BTRFS_DEBUG") migrated some features from CONFIG_BTRFS_DEBUG to
CONFIG_BTRFS_EXPERIMENTAL. We could also move the corresponding sysfs
entries for these features, as there is no point in retaining the sysfs
interfaces once the feature is moved.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/sysfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 96d0480d1b9e..50b8b8847dd4 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -295,7 +295,7 @@ BTRFS_FEAT_ATTR_INCOMPAT(simple_quota, SIMPLE_QUOTA);
 #ifdef CONFIG_BLK_DEV_ZONED
 BTRFS_FEAT_ATTR_INCOMPAT(zoned, ZONED);
 #endif
-#ifdef CONFIG_BTRFS_DEBUG
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
 /* Remove once support for extent tree v2 is feature complete */
 BTRFS_FEAT_ATTR_INCOMPAT(extent_tree_v2, EXTENT_TREE_V2);
 /* Remove once support for raid stripe tree is feature complete. */
@@ -329,7 +329,7 @@ static struct attribute *btrfs_supported_feature_attrs[] = {
 #ifdef CONFIG_BLK_DEV_ZONED
 	BTRFS_FEAT_ATTR_PTR(zoned),
 #endif
-#ifdef CONFIG_BTRFS_DEBUG
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
 	BTRFS_FEAT_ATTR_PTR(extent_tree_v2),
 	BTRFS_FEAT_ATTR_PTR(raid_stripe_tree),
 #endif
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 09/10] btrfs: enable RAID1 balancing configuration via modprobe parameter
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (7 preceding siblings ...)
  2024-11-15 14:54 ` [PATCH v3 08/10] btrfs: fix CONFIG_BTRFS_EXPERIMENTAL migration Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-11-15 14:54 ` [PATCH v3 10/10] btrfs: modload to print RAID1 balancing status Anand Jain
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

This update allows configuring the `raid1-balancing` methods using a
modprobe parameter when experimental mode CONFIG_BTRFS_EXPERIMENTAL
is enabled.

Examples:

- Set the RAID1 balancing method to round-robin with a custom
`min_contiguous_read` of 192k:
  $ modprobe btrfs raid1-balancing=round-robin:196608

- Set the round-robin balancing method with the default
`min_contiguous_read` of 256k:
  $ modprobe btrfs raid1-balancing=round-robin

- Set the `devid` balancing method, defaulting to the latest
device:
  $ modprobe btrfs raid1-balancing=devid

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/super.c   |  5 +++++
 fs/btrfs/sysfs.c   | 29 +++++++++++++++++++++++++++++
 fs/btrfs/sysfs.h   |  5 +++++
 fs/btrfs/volumes.c |  5 ++++-
 4 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d52f7f6e2de7..ecadc8e0dcfb 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2552,6 +2552,11 @@ static const struct init_sequence mod_init_seq[] = {
 	}, {
 		.init_func = extent_map_init,
 		.exit_func = extent_map_exit,
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+	}, {
+		.init_func = btrfs_raid1_balancing_init,
+		.exit_func = NULL,
+#endif
 	}, {
 		.init_func = ordered_data_init,
 		.exit_func = ordered_data_exit,
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 50b8b8847dd4..50bc4b6cb821 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1307,11 +1307,28 @@ BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
 
 #ifdef CONFIG_BTRFS_EXPERIMENTAL
 static const char * const btrfs_read_policy_name[] = { "pid", "round-robin", "devid" };
+
+/* Global module configuration parameters */
+static char *raid1_balancing;
+char *btrfs_get_raid1_balancing(void)
+{
+	return raid1_balancing;
+}
+
+/* Set perm 0, disable sys/module/btrfs/parameter/raid1_balancing interface */
+module_param(raid1_balancing, charp, 0);
+MODULE_PARM_DESC(raid1_balancing,
+"Global read policy; pid (default), round-robin:[min_contiguous_read], devid:[[devid]|[latest-gen]|[oldest-gen]]");
+
 #else
 static const char * const btrfs_read_policy_name[] = { "pid" };
 #endif
 
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value)
+#else
 static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value)
+#endif
 {
 #ifdef CONFIG_BTRFS_EXPERIMENTAL
 	char *value_str;
@@ -1354,6 +1371,18 @@ static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *va
 	return -EINVAL;
 }
 
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+int __init btrfs_raid1_balancing_init(void)
+{
+	if (btrfs_read_policy_to_enum(raid1_balancing, NULL) == -EINVAL) {
+		btrfs_err(NULL, "Invalid raid1_balancing %s", raid1_balancing);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+#endif
+
 static ssize_t btrfs_read_policy_show(struct kobject *kobj,
 				      struct kobj_attribute *a, char *buf)
 {
diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
index e6a284c59809..f0fb0c0c2f7d 100644
--- a/fs/btrfs/sysfs.h
+++ b/fs/btrfs/sysfs.h
@@ -47,5 +47,10 @@ void btrfs_sysfs_del_qgroups(struct btrfs_fs_info *fs_info);
 int btrfs_sysfs_add_qgroups(struct btrfs_fs_info *fs_info);
 void btrfs_sysfs_del_one_qgroup(struct btrfs_fs_info *fs_info,
 				struct btrfs_qgroup *qgroup);
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value);
+int __init btrfs_raid1_balancing_init(void);
+char *btrfs_get_raid1_balancing(void);
+#endif
 
 #endif
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 90d84fb664aa..2b31fdffea08 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1327,11 +1327,14 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
 	fs_devices->latest_dev = latest_dev;
 	fs_devices->total_rw_bytes = 0;
 	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
-	fs_devices->read_policy = BTRFS_READ_POLICY_PID;
 #ifdef CONFIG_BTRFS_EXPERIMENTAL
 	/* Set min_contiguous_read to a default 256kib */
 	fs_devices->min_contiguous_read = 256 * 1024;
 	fs_devices->read_devid = latest_dev->devid;
+	fs_devices->read_policy =
+		btrfs_read_policy_to_enum(btrfs_get_raid1_balancing(), NULL);
+#else
+	fs_devices->read_policy = BTRFS_READ_POLICY_PID;
 #endif
 
 	return 0;
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 10/10] btrfs: modload to print RAID1 balancing status
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (8 preceding siblings ...)
  2024-11-15 14:54 ` [PATCH v3 09/10] btrfs: enable RAID1 balancing configuration via modprobe parameter Anand Jain
@ 2024-11-15 14:54 ` Anand Jain
  2024-11-15 15:20 ` [PATCH 0/2] fstests: refine filesystem module reload handling Anand Jain
  2024-11-15 19:16 ` [PATCH v3 00/10] raid1 balancing methods Filipe Manana
  11 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 14:54 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, wqu, hrx, waxhead

Modified the Btrfs loading message to include the RAID1 balancing status
if the experimental feature is enabled.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/super.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index ecadc8e0dcfb..8eb5da5da693 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2494,7 +2494,17 @@ static int __init btrfs_print_mod_info(void)
 			", experimental=no"
 #endif
 			;
-	pr_info("Btrfs loaded%s\n", options);
+
+#ifdef CONFIG_BTRFS_EXPERIMENTAL
+	if (btrfs_get_raid1_balancing() == NULL)
+		pr_info("Btrfs loaded%s\n", options);
+	else
+		pr_info("Btrfs loaded%s, raid1_balancing=%s\n",
+			 options, btrfs_get_raid1_balancing());
+#else
+		pr_info("Btrfs loaded%s\n", options);
+#endif
+
 	return 0;
 }
 
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 0/2] fstests: refine filesystem module reload handling
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (9 preceding siblings ...)
  2024-11-15 14:54 ` [PATCH v3 10/10] btrfs: modload to print RAID1 balancing status Anand Jain
@ 2024-11-15 15:20 ` Anand Jain
  2024-11-15 15:20   ` [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function Anand Jain
  2024-11-15 15:20   ` [PATCH 2/2] fstests: FS_MODULE_RELOAD_OPTIONS to control filesystem module reload options Anand Jain
  2024-11-15 19:16 ` [PATCH v3 00/10] raid1 balancing methods Filipe Manana
  11 siblings, 2 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 15:20 UTC (permalink / raw)
  To: fstests; +Cc: linux-btrfs

These changes refine module reload control in testing. Patch 1 reloads
the module earlier in run_section, ensuring each section's first test
starts with a reloaded module. Patch 2 adds FS_MODULE_RELOAD_OPTIONS to
pass module options during reload, useful for the Btrfs pre-mount
configurations that aren’t available as mount options.

Anand Jain (2):
  fstests: move fs-module reload to earlier in the run_section function
  fstests: FS_MODULE_RELOAD_OPTIONS to control filesystem module reload
    options

 check | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

-- 
2.46.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function
  2024-11-15 15:20 ` [PATCH 0/2] fstests: refine filesystem module reload handling Anand Jain
@ 2024-11-15 15:20   ` Anand Jain
  2024-11-15 16:39     ` Darrick J. Wong
  2024-11-15 16:48     ` Filipe Manana
  2024-11-15 15:20   ` [PATCH 2/2] fstests: FS_MODULE_RELOAD_OPTIONS to control filesystem module reload options Anand Jain
  1 sibling, 2 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-15 15:20 UTC (permalink / raw)
  To: fstests; +Cc: linux-btrfs

Reload the module before each test, instead of later in run_section.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 check | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/check b/check
index 9222cd7e4f81..d8ee73f48c77 100755
--- a/check
+++ b/check
@@ -935,6 +935,15 @@ function run_section()
 			continue
 		fi
 
+		# Reload the module after each test to check for leaks or
+		# other problems.
+		if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
+			_test_unmount 2> /dev/null
+			_scratch_unmount 2> /dev/null
+			modprobe -r fs-$FSTYP
+			modprobe fs-$FSTYP
+		fi
+
 		# record that we really tried to run this test.
 		if ((!${#loop_status[*]})); then
 			try+=("$seqnum")
@@ -1033,15 +1042,6 @@ function run_section()
 			done
 		fi
 
-		# Reload the module after each test to check for leaks or
-		# other problems.
-		if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
-			_test_unmount 2> /dev/null
-			_scratch_unmount 2> /dev/null
-			modprobe -r fs-$FSTYP
-			modprobe fs-$FSTYP
-		fi
-
 		# Scan for memory leaks after every test so that associating
 		# a leak to a particular test will be as accurate as possible.
 		_check_kmemleak || tc_status="fail"
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/2] fstests: FS_MODULE_RELOAD_OPTIONS to control filesystem module reload options
  2024-11-15 15:20 ` [PATCH 0/2] fstests: refine filesystem module reload handling Anand Jain
  2024-11-15 15:20   ` [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function Anand Jain
@ 2024-11-15 15:20   ` Anand Jain
  2024-11-15 16:34     ` Darrick J. Wong
  1 sibling, 1 reply; 29+ messages in thread
From: Anand Jain @ 2024-11-15 15:20 UTC (permalink / raw)
  To: fstests; +Cc: linux-btrfs

Extend module reload logic to allow passing additional options via
`FS_MODULE_RELOAD_OPTIONS`. This enhancement enables more flexible
configuration during module reloads, which can be useful for testing
specific module parameters.

Maintains existing behavior for `TEST_FS_MODULE_RELOAD`.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 check | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/check b/check
index d8ee73f48c77..ced86466a4bb 100755
--- a/check
+++ b/check
@@ -937,11 +937,12 @@ function run_section()
 
 		# Reload the module after each test to check for leaks or
 		# other problems.
-		if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
+		if [[ -n "${TEST_FS_MODULE_RELOAD}" ||
+		      -n "${FS_MODULE_RELOAD_OPTIONS}" ]]; then
 			_test_unmount 2> /dev/null
 			_scratch_unmount 2> /dev/null
 			modprobe -r fs-$FSTYP
-			modprobe fs-$FSTYP
+			modprobe fs-$FSTYP ${FS_MODULE_RELOAD_OPTIONS}
 		fi
 
 		# record that we really tried to run this test.
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/2] fstests: FS_MODULE_RELOAD_OPTIONS to control filesystem module reload options
  2024-11-15 15:20   ` [PATCH 2/2] fstests: FS_MODULE_RELOAD_OPTIONS to control filesystem module reload options Anand Jain
@ 2024-11-15 16:34     ` Darrick J. Wong
  0 siblings, 0 replies; 29+ messages in thread
From: Darrick J. Wong @ 2024-11-15 16:34 UTC (permalink / raw)
  To: Anand Jain; +Cc: fstests, linux-btrfs

On Fri, Nov 15, 2024 at 11:20:52PM +0800, Anand Jain wrote:
> Extend module reload logic to allow passing additional options via
> `FS_MODULE_RELOAD_OPTIONS`. This enhancement enables more flexible
> configuration during module reloads, which can be useful for testing
> specific module parameters.

You might want to document the existence of this knob in the README.

> Maintains existing behavior for `TEST_FS_MODULE_RELOAD`.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  check | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/check b/check
> index d8ee73f48c77..ced86466a4bb 100755
> --- a/check
> +++ b/check
> @@ -937,11 +937,12 @@ function run_section()
>  
>  		# Reload the module after each test to check for leaks or
>  		# other problems.
> -		if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
> +		if [[ -n "${TEST_FS_MODULE_RELOAD}" ||
> +		      -n "${FS_MODULE_RELOAD_OPTIONS}" ]]; then
>  			_test_unmount 2> /dev/null
>  			_scratch_unmount 2> /dev/null
>  			modprobe -r fs-$FSTYP
> -			modprobe fs-$FSTYP
> +			modprobe fs-$FSTYP ${FS_MODULE_RELOAD_OPTIONS}

What happens if the test itself decides to reload the $FSTYP module?
Do you want these options to propagate to those reloadings as well?
AFAICT there are some btrfs/ group tests that do such things.

(Perhaps you'd be better off injecting FS_MODULE_RELOAD_OPTIONS into
/etc/modprobe.d/<somefile> to catch all these cases?)

--D

>  		fi
>  
>  		# record that we really tried to run this test.
> -- 
> 2.46.1
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function
  2024-11-15 15:20   ` [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function Anand Jain
@ 2024-11-15 16:39     ` Darrick J. Wong
  2024-11-18  0:12       ` Anand Jain
  2024-11-15 16:48     ` Filipe Manana
  1 sibling, 1 reply; 29+ messages in thread
From: Darrick J. Wong @ 2024-11-15 16:39 UTC (permalink / raw)
  To: Anand Jain; +Cc: fstests, linux-btrfs

On Fri, Nov 15, 2024 at 11:20:51PM +0800, Anand Jain wrote:
> Reload the module before each test, instead of later in run_section.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  check | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/check b/check
> index 9222cd7e4f81..d8ee73f48c77 100755
> --- a/check
> +++ b/check
> @@ -935,6 +935,15 @@ function run_section()
>  			continue
>  		fi
>  
> +		# Reload the module after each test to check for leaks or
> +		# other problems.

Hrmm.  The nice thing about doing the reload /after/ each test is that
unloading the module will purge any per-fs slab caches that the module
creates.  The slab teardown logs any forgotten objects while $seq still
points to the test that leaked those objects.  Granted it's not a 100%
solution (that job falls to kmemcheck) but it's a cheap check.

This change makes it so that if test N leaks something, fstests reports
them for test N+1.

--D

> +		if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
> +			_test_unmount 2> /dev/null
> +			_scratch_unmount 2> /dev/null
> +			modprobe -r fs-$FSTYP
> +			modprobe fs-$FSTYP
> +		fi
> +
>  		# record that we really tried to run this test.
>  		if ((!${#loop_status[*]})); then
>  			try+=("$seqnum")
> @@ -1033,15 +1042,6 @@ function run_section()
>  			done
>  		fi
>  
> -		# Reload the module after each test to check for leaks or
> -		# other problems.
> -		if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
> -			_test_unmount 2> /dev/null
> -			_scratch_unmount 2> /dev/null
> -			modprobe -r fs-$FSTYP
> -			modprobe fs-$FSTYP
> -		fi
> -
>  		# Scan for memory leaks after every test so that associating
>  		# a leak to a particular test will be as accurate as possible.
>  		_check_kmemleak || tc_status="fail"
> -- 
> 2.46.1
> 
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function
  2024-11-15 15:20   ` [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function Anand Jain
  2024-11-15 16:39     ` Darrick J. Wong
@ 2024-11-15 16:48     ` Filipe Manana
  1 sibling, 0 replies; 29+ messages in thread
From: Filipe Manana @ 2024-11-15 16:48 UTC (permalink / raw)
  To: Anand Jain; +Cc: fstests, linux-btrfs

On Fri, Nov 15, 2024 at 3:22 PM Anand Jain <anand.jain@oracle.com> wrote:
>
> Reload the module before each test, instead of later in run_section.

Why?
The change log should give a justification for the change.
It's just stating what the code changes are doing but without any reasoning.

Thanks.

>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  check | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/check b/check
> index 9222cd7e4f81..d8ee73f48c77 100755
> --- a/check
> +++ b/check
> @@ -935,6 +935,15 @@ function run_section()
>                         continue
>                 fi
>
> +               # Reload the module after each test to check for leaks or
> +               # other problems.
> +               if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
> +                       _test_unmount 2> /dev/null
> +                       _scratch_unmount 2> /dev/null
> +                       modprobe -r fs-$FSTYP
> +                       modprobe fs-$FSTYP
> +               fi
> +
>                 # record that we really tried to run this test.
>                 if ((!${#loop_status[*]})); then
>                         try+=("$seqnum")
> @@ -1033,15 +1042,6 @@ function run_section()
>                         done
>                 fi
>
> -               # Reload the module after each test to check for leaks or
> -               # other problems.
> -               if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
> -                       _test_unmount 2> /dev/null
> -                       _scratch_unmount 2> /dev/null
> -                       modprobe -r fs-$FSTYP
> -                       modprobe fs-$FSTYP
> -               fi
> -
>                 # Scan for memory leaks after every test so that associating
>                 # a leak to a particular test will be as accurate as possible.
>                 _check_kmemleak || tc_status="fail"
> --
> 2.46.1
>
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 00/10] raid1 balancing methods
  2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
                   ` (10 preceding siblings ...)
  2024-11-15 15:20 ` [PATCH 0/2] fstests: refine filesystem module reload handling Anand Jain
@ 2024-11-15 19:16 ` Filipe Manana
  2024-11-18  1:15   ` Anand Jain
  11 siblings, 1 reply; 29+ messages in thread
From: Filipe Manana @ 2024-11-15 19:16 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On Fri, Nov 15, 2024 at 2:55 PM Anand Jain <anand.jain@oracle.com> wrote:
>
> v3:
> 1. Removed the latency-based RAID1 balancing patch. (Per David's review)
> 2. Renamed "rotation" to "round-robin" and set the per-set
>    min_contiguous_read to 256k. (Per David's review)
> 3. Added raid1-balancing module configuration for fstests testing.
>    raid1-balancing can now be configured through both module load
>    parameters and sysfs.
>
> The logic of individual methods remains unchanged, and performance metrics
> are consistent with v2.
>
> -----
> v2:
> 1. Move new features to CONFIG_BTRFS_EXPERIMENTAL instead of CONFIG_BTRFS_DEBUG.
> 2. Correct the typo from %est_wait to %best_wait.
> 3. Initialize %best_wait to U64_MAX and remove the check for 0.
> 4. Implement rotation with a minimum contiguous read threshold before
>    switching to the next stripe. Configure this, using:
>
>         echo rotation:[min_contiguous_read] > /sys/fs/btrfs/<uuid>/read_policy
>
>    The default value is the sector size, and the min_contiguous_read
>    value must be a multiple of the sector size.
>
> 5. Tested FIO random read/write and defrag compression workloads with
>    min_contiguous_read set to sector size, 192k, and 256k.
>
>    RAID1 balancing method rotation is better for multi-process workloads
>    such as fio and also single-process workload such as defragmentation.
>
>      $ fio --filename=/btrfs/foo --size=5Gi --direct=1 --rw=randrw --bs=4k \
>         --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 \
>         --time_based --group_reporting --name=iops-test-job --eta-newline=1
>
>
> |         |            |            | Read I/O count  |
> |         | Read       | Write      | devid1 | devid2 |
> |---------|------------|------------|--------|--------|
> | pid     | 20.3MiB/s  | 20.5MiB/s  | 313895 | 313895 |
> | rotation|            |            |        |        |
> |     4096| 20.4MiB/s  | 20.5MiB/s  | 313895 | 313895 |
> |   196608| 20.2MiB/s  | 20.2MiB/s  | 310152 | 310175 |
> |   262144| 20.3MiB/s  | 20.4MiB/s  | 312180 | 312191 |
> |  latency| 18.4MiB/s  | 18.4MiB/s  | 272980 | 291683 |
> | devid:1 | 14.8MiB/s  | 14.9MiB/s  | 456376 | 0      |
>
>    rotation RAID1 balancing technique performs more than 2x better for
>    single-process defrag.
>
>       $ time -p btrfs filesystem defrag -r -f -c /btrfs
>
>
> |         | Time  | Read I/O Count  |
> |         | Real  | devid1 | devid2 |
> |---------|-------|--------|--------|
> | pid     | 18.00s| 3800   | 0      |
> | rotation|       |        |        |
> |     4096|  8.95s| 1900   | 1901   |
> |   196608|  8.50s| 1881   | 1919   |
> |   262144|  8.80s| 1881   | 1919   |
> | latency | 17.18s| 3800   | 0      |
> | devid:2 | 17.48s| 0      | 3800   |
>
> Rotation keeps all devices active, and for now, the Rotation RAID1
> balancing method is preferable as default. More workload testing is
> needed while the code is EXPERIMENTAL.
> While Latency is better during the failing/unstable block layer transport.
> As of now these two techniques, are needed to be further independently
> tested with different worloads, and in the long term we should be merge
> these technique to a unified heuristic.
>
> Rotation keeps all devices active, and for now, the Rotation RAID1
> balancing method should be the default. More workload testing is needed
> while the code is EXPERIMENTAL.
>
> Latency is smarter with unstable block layer transport.
>
> Both techniques need independent testing across workloads, with the goal of
> eventually merging them into a unified approach? for the long term.
>
> Devid is a hands-on approach, provides manual or user-space script control.
>
> These RAID1 balancing methods are tunable via the sysfs knob.
> The mount -o option and btrfs properties are under consideration.
>
> Thx.
>
> --------- original v1 ------------
>
> The RAID1-balancing methods helps distribute read I/O across devices, and
> this patch introduces three balancing methods: rotation, latency, and
> devid. These methods are enabled under the `CONFIG_BTRFS_DEBUG` config
> option and are on top of the previously added
> `/sys/fs/btrfs/<UUID>/read_policy` interface to configure the desired
> RAID1 read balancing method.
>
> I've tested these patches using fio and filesystem defragmentation
> workloads on a two-device RAID1 setup (with both data and metadata
> mirrored across identical devices). I tracked device read counts by
> extracting stats from `/sys/devices/<..>/stat` for each device. Below is
> a summary of the results, with each result the average of three
> iterations.
>
> A typical generic random rw workload:
>
> $ fio --filename=/btrfs/foo --size=10Gi --direct=1 --rw=randrw --bs=4k \
>   --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 --time_based \
>   --group_reporting --name=iops-test-job --eta-newline=1
>
> |         |            |            | Read I/O count  |
> |         | Read       | Write      | devid1 | devid2 |
> |---------|------------|------------|--------|--------|
> | pid     | 29.4MiB/s  | 29.5MiB/s  | 456548 | 447975 |
> | rotation| 29.3MiB/s  | 29.3MiB/s  | 450105 | 450055 |
> | latency | 21.9MiB/s  | 21.9MiB/s  | 672387 | 0      |
> | devid:1 | 22.0MiB/s  | 22.0MiB/s  | 674788 | 0      |
>
> Defragmentation with compression workload:
>
> $ xfs_io -f -d -c 'pwrite -S 0xab 0 1G' /btrfs/foo
> $ sync
> $ echo 3 > /proc/sys/vm/drop_caches
> $ btrfs filesystem defrag -f -c /btrfs/foo
>
> |         | Time  | Read I/O Count  |
> |         | Real  | devid1 | devid2 |
> |---------|-------|--------|--------|
> | pid     | 21.61s| 3810   | 0      |
> | rotation| 11.55s| 1905   | 1905   |
> | latency | 20.99s| 0      | 3810   |
> | devid:2 | 21.41s| 0      | 3810   |
>
> . The PID-based balancing method works well for the generic random rw fio
>   workload.
> . The rotation method is ideal when you want to keep both devices active,
>   and it boosts performance in sequential defragmentation scenarios.
> . The latency-based method work well when we have mixed device types or
>   when one device experiences intermittent I/O failures the latency
>   increases and it automatically picks the other device for further Read
>   IOs.
> . The devid method is a more hands-on approach, useful for diagnosing and
>   testing RAID1 mirror synchronizations.
>
> Anand Jain (10):
>   btrfs: initialize fs_devices->fs_info earlier
>   btrfs: simplify output formatting in btrfs_read_policy_show
>   btrfs: add btrfs_read_policy_to_enum helper and refactor read policy
>     store
>   btrfs: handle value associated with raid1 balancing parameter
>   btrfs: introduce RAID1 round-robin read balancing
>   btrfs: add RAID1 preferred read device
>   btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status
>   btrfs: fix CONFIG_BTRFS_EXPERIMENTAL migration

Why are these two patches, which are fixes unrelated to the raid
balancing feature, in the middle of this patchset?
These should go as a separate patchset...

Also for the second patch, there's already a fix from yesterday and in for-next:

https://lore.kernel.org/linux-btrfs/c7b550091f427a79ec5a9aa6c5ac6b5efbdb4e8f.1731605782.git.fdmanana@suse.com/

Thanks.

>   btrfs: enable RAID1 balancing configuration via modprobe parameter
>   btrfs: modload to print RAID1 balancing status
>
>  fs/btrfs/disk-io.c |   1 +
>  fs/btrfs/super.c   |  22 +++++-
>  fs/btrfs/sysfs.c   | 181 ++++++++++++++++++++++++++++++++++++++++-----
>  fs/btrfs/sysfs.h   |   5 ++
>  fs/btrfs/volumes.c |  86 ++++++++++++++++++++-
>  fs/btrfs/volumes.h |  14 ++++
>  6 files changed, 286 insertions(+), 23 deletions(-)
>
> --
> 2.46.1
>
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function
  2024-11-15 16:39     ` Darrick J. Wong
@ 2024-11-18  0:12       ` Anand Jain
  0 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-18  0:12 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: fstests, linux-btrfs

On 16/11/24 00:39, Darrick J. Wong wrote:
> On Fri, Nov 15, 2024 at 11:20:51PM +0800, Anand Jain wrote:
>> Reload the module before each test, instead of later in run_section.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   check | 18 +++++++++---------
>>   1 file changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/check b/check
>> index 9222cd7e4f81..d8ee73f48c77 100755
>> --- a/check
>> +++ b/check
>> @@ -935,6 +935,15 @@ function run_section()
>>   			continue
>>   		fi
>>   
>> +		# Reload the module after each test to check for leaks or
>> +		# other problems.
> 
> Hrmm.  The nice thing about doing the reload /after/ each test is that
> unloading the module will purge any per-fs slab caches that the module
> creates.  The slab teardown logs any forgotten objects while $seq still
> points to the test that leaked those objects.  Granted it's not a 100%
> solution (that job falls to kmemcheck) but it's a cheap check.
> 
> This change makes it so that if test N leaks something, fstests reports
> them for test N+1.
> 

Agreed. It's better to report the issue at test N. I'll add this comment 
to the code in v2.

Thanks!
Anand

> --D
> 
>> +		if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
>> +			_test_unmount 2> /dev/null
>> +			_scratch_unmount 2> /dev/null
>> +			modprobe -r fs-$FSTYP
>> +			modprobe fs-$FSTYP
>> +		fi
>> +
>>   		# record that we really tried to run this test.
>>   		if ((!${#loop_status[*]})); then
>>   			try+=("$seqnum")
>> @@ -1033,15 +1042,6 @@ function run_section()
>>   			done
>>   		fi
>>   
>> -		# Reload the module after each test to check for leaks or
>> -		# other problems.
>> -		if [ -n "${TEST_FS_MODULE_RELOAD}" ]; then
>> -			_test_unmount 2> /dev/null
>> -			_scratch_unmount 2> /dev/null
>> -			modprobe -r fs-$FSTYP
>> -			modprobe fs-$FSTYP
>> -		fi
>> -
>>   		# Scan for memory leaks after every test so that associating
>>   		# a leak to a particular test will be as accurate as possible.
>>   		_check_kmemleak || tc_status="fail"
>> -- 
>> 2.46.1
>>
>>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 00/10] raid1 balancing methods
  2024-11-15 19:16 ` [PATCH v3 00/10] raid1 balancing methods Filipe Manana
@ 2024-11-18  1:15   ` Anand Jain
  0 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-11-18  1:15 UTC (permalink / raw)
  To: Filipe Manana; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On 16/11/24 03:16, Filipe Manana wrote:
> On Fri, Nov 15, 2024 at 2:55 PM Anand Jain <anand.jain@oracle.com> wrote:
>>
>> v3:
>> 1. Removed the latency-based RAID1 balancing patch. (Per David's review)
>> 2. Renamed "rotation" to "round-robin" and set the per-set
>>     min_contiguous_read to 256k. (Per David's review)
>> 3. Added raid1-balancing module configuration for fstests testing.
>>     raid1-balancing can now be configured through both module load
>>     parameters and sysfs.
>>
>> The logic of individual methods remains unchanged, and performance metrics
>> are consistent with v2.
>>
>> -----
>> v2:
>> 1. Move new features to CONFIG_BTRFS_EXPERIMENTAL instead of CONFIG_BTRFS_DEBUG.
>> 2. Correct the typo from %est_wait to %best_wait.
>> 3. Initialize %best_wait to U64_MAX and remove the check for 0.
>> 4. Implement rotation with a minimum contiguous read threshold before
>>     switching to the next stripe. Configure this, using:
>>
>>          echo rotation:[min_contiguous_read] > /sys/fs/btrfs/<uuid>/read_policy
>>
>>     The default value is the sector size, and the min_contiguous_read
>>     value must be a multiple of the sector size.
>>
>> 5. Tested FIO random read/write and defrag compression workloads with
>>     min_contiguous_read set to sector size, 192k, and 256k.
>>
>>     RAID1 balancing method rotation is better for multi-process workloads
>>     such as fio and also single-process workload such as defragmentation.
>>
>>       $ fio --filename=/btrfs/foo --size=5Gi --direct=1 --rw=randrw --bs=4k \
>>          --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 \
>>          --time_based --group_reporting --name=iops-test-job --eta-newline=1
>>
>>
>> |         |            |            | Read I/O count  |
>> |         | Read       | Write      | devid1 | devid2 |
>> |---------|------------|------------|--------|--------|
>> | pid     | 20.3MiB/s  | 20.5MiB/s  | 313895 | 313895 |
>> | rotation|            |            |        |        |
>> |     4096| 20.4MiB/s  | 20.5MiB/s  | 313895 | 313895 |
>> |   196608| 20.2MiB/s  | 20.2MiB/s  | 310152 | 310175 |
>> |   262144| 20.3MiB/s  | 20.4MiB/s  | 312180 | 312191 |
>> |  latency| 18.4MiB/s  | 18.4MiB/s  | 272980 | 291683 |
>> | devid:1 | 14.8MiB/s  | 14.9MiB/s  | 456376 | 0      |
>>
>>     rotation RAID1 balancing technique performs more than 2x better for
>>     single-process defrag.
>>
>>        $ time -p btrfs filesystem defrag -r -f -c /btrfs
>>
>>
>> |         | Time  | Read I/O Count  |
>> |         | Real  | devid1 | devid2 |
>> |---------|-------|--------|--------|
>> | pid     | 18.00s| 3800   | 0      |
>> | rotation|       |        |        |
>> |     4096|  8.95s| 1900   | 1901   |
>> |   196608|  8.50s| 1881   | 1919   |
>> |   262144|  8.80s| 1881   | 1919   |
>> | latency | 17.18s| 3800   | 0      |
>> | devid:2 | 17.48s| 0      | 3800   |
>>
>> Rotation keeps all devices active, and for now, the Rotation RAID1
>> balancing method is preferable as default. More workload testing is
>> needed while the code is EXPERIMENTAL.
>> While Latency is better during the failing/unstable block layer transport.
>> As of now these two techniques, are needed to be further independently
>> tested with different worloads, and in the long term we should be merge
>> these technique to a unified heuristic.
>>
>> Rotation keeps all devices active, and for now, the Rotation RAID1
>> balancing method should be the default. More workload testing is needed
>> while the code is EXPERIMENTAL.
>>
>> Latency is smarter with unstable block layer transport.
>>
>> Both techniques need independent testing across workloads, with the goal of
>> eventually merging them into a unified approach? for the long term.
>>
>> Devid is a hands-on approach, provides manual or user-space script control.
>>
>> These RAID1 balancing methods are tunable via the sysfs knob.
>> The mount -o option and btrfs properties are under consideration.
>>
>> Thx.
>>
>> --------- original v1 ------------
>>
>> The RAID1-balancing methods helps distribute read I/O across devices, and
>> this patch introduces three balancing methods: rotation, latency, and
>> devid. These methods are enabled under the `CONFIG_BTRFS_DEBUG` config
>> option and are on top of the previously added
>> `/sys/fs/btrfs/<UUID>/read_policy` interface to configure the desired
>> RAID1 read balancing method.
>>
>> I've tested these patches using fio and filesystem defragmentation
>> workloads on a two-device RAID1 setup (with both data and metadata
>> mirrored across identical devices). I tracked device read counts by
>> extracting stats from `/sys/devices/<..>/stat` for each device. Below is
>> a summary of the results, with each result the average of three
>> iterations.
>>
>> A typical generic random rw workload:
>>
>> $ fio --filename=/btrfs/foo --size=10Gi --direct=1 --rw=randrw --bs=4k \
>>    --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 --time_based \
>>    --group_reporting --name=iops-test-job --eta-newline=1
>>
>> |         |            |            | Read I/O count  |
>> |         | Read       | Write      | devid1 | devid2 |
>> |---------|------------|------------|--------|--------|
>> | pid     | 29.4MiB/s  | 29.5MiB/s  | 456548 | 447975 |
>> | rotation| 29.3MiB/s  | 29.3MiB/s  | 450105 | 450055 |
>> | latency | 21.9MiB/s  | 21.9MiB/s  | 672387 | 0      |
>> | devid:1 | 22.0MiB/s  | 22.0MiB/s  | 674788 | 0      |
>>
>> Defragmentation with compression workload:
>>
>> $ xfs_io -f -d -c 'pwrite -S 0xab 0 1G' /btrfs/foo
>> $ sync
>> $ echo 3 > /proc/sys/vm/drop_caches
>> $ btrfs filesystem defrag -f -c /btrfs/foo
>>
>> |         | Time  | Read I/O Count  |
>> |         | Real  | devid1 | devid2 |
>> |---------|-------|--------|--------|
>> | pid     | 21.61s| 3810   | 0      |
>> | rotation| 11.55s| 1905   | 1905   |
>> | latency | 20.99s| 0      | 3810   |
>> | devid:2 | 21.41s| 0      | 3810   |
>>
>> . The PID-based balancing method works well for the generic random rw fio
>>    workload.
>> . The rotation method is ideal when you want to keep both devices active,
>>    and it boosts performance in sequential defragmentation scenarios.
>> . The latency-based method work well when we have mixed device types or
>>    when one device experiences intermittent I/O failures the latency
>>    increases and it automatically picks the other device for further Read
>>    IOs.
>> . The devid method is a more hands-on approach, useful for diagnosing and
>>    testing RAID1 mirror synchronizations.
>>
>> Anand Jain (10):
>>    btrfs: initialize fs_devices->fs_info earlier
>>    btrfs: simplify output formatting in btrfs_read_policy_show
>>    btrfs: add btrfs_read_policy_to_enum helper and refactor read policy
>>      store
>>    btrfs: handle value associated with raid1 balancing parameter
>>    btrfs: introduce RAID1 round-robin read balancing
>>    btrfs: add RAID1 preferred read device


>>    btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status


>>    btrfs: fix CONFIG_BTRFS_EXPERIMENTAL migration

This patch can be dropped, as Filipe's patch below already fixes it.

> 
> Why are these two patches, which are fixes unrelated to the raid

This patchset includes the last patch, which depends on the patch
(btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status). I would prefer to
keep it here.
But the question is there a more effective way to indicate
experimental features with the raid1-balancing as the use-case?
The current issue with the patchset is that `dmesg` doesn't clearly
mark `raid1_balancing` as experimental. Would it be better to add
"(experimental: )" or simply use "()" to  highlight this?

---
[ 3663.742802] Btrfs loaded, debug=on, assert=on, zoned=yes, 
fsverity=yes, experimental=yes, raid1_balancing=round-robin
---

or
---
[ 3663.742802] Btrfs loaded, debug=on, assert=on, zoned=yes, 
fsverity=yes, (experimental: raid1_balancing=round-robin)
---

or
---
[ 3663.742802] Btrfs loaded, debug=on, assert=on, zoned=yes, 
fsverity=yes, (raid1_balancing=round-robin)
---



> balancing feature, in the middle of this patchset?
> These should go as a separate patchset...
> 
> Also for the second patch, there's already a fix from yesterday and in for-next:
> 
> https://lore.kernel.org/linux-btrfs/c7b550091f427a79ec5a9aa6c5ac6b5efbdb4e8f.1731605782.git.fdmanana@suse.com/
> 

Thanks for the review.

-Anand

> Thanks.
> 
>>    btrfs: enable RAID1 balancing configuration via modprobe parameter
>>    btrfs: modload to print RAID1 balancing status
>>
>>   fs/btrfs/disk-io.c |   1 +
>>   fs/btrfs/super.c   |  22 +++++-
>>   fs/btrfs/sysfs.c   | 181 ++++++++++++++++++++++++++++++++++++++++-----
>>   fs/btrfs/sysfs.h   |   5 ++
>>   fs/btrfs/volumes.c |  86 ++++++++++++++++++++-
>>   fs/btrfs/volumes.h |  14 ++++
>>   6 files changed, 286 insertions(+), 23 deletions(-)
>>
>> --
>> 2.46.1
>>
>>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing
  2024-11-15 14:54 ` [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing Anand Jain
@ 2024-11-18  7:34   ` kernel test robot
  2024-12-06 17:13   ` David Sterba
  1 sibling, 0 replies; 29+ messages in thread
From: kernel test robot @ 2024-11-18  7:34 UTC (permalink / raw)
  To: Anand Jain, linux-btrfs; +Cc: oe-kbuild-all, dsterba, wqu, hrx, waxhead

Hi Anand,

kernel test robot noticed the following build errors:

[auto build test ERROR on kdave/for-next]
[also build test ERROR on next-20241115]
[cannot apply to linus/master v6.12]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Anand-Jain/btrfs-initialize-fs_devices-fs_info-earlier/20241116-230928
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link:    https://lore.kernel.org/r/995d4a9dd9f553825805efdac24dec4a9de20ef3.1731076425.git.anand.jain%40oracle.com
patch subject: [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing
config: m68k-randconfig-r113-20241117 (https://download.01.org/0day-ci/archive/20241118/202411181508.CyjUUrnX-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 14.2.0
reproduce: (https://download.01.org/0day-ci/archive/20241118/202411181508.CyjUUrnX-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411181508.CyjUUrnX-lkp@intel.com/

All errors (new ones prefixed by >>):

   m68k-linux-ld: fs/btrfs/sysfs.o: in function `btrfs_read_policy_store':
>> sysfs.c:(.text+0x197a): undefined reference to `__moddi3'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 03/10] btrfs: add btrfs_read_policy_to_enum helper and refactor read policy store
  2024-11-15 14:54 ` [PATCH v3 03/10] btrfs: add btrfs_read_policy_to_enum helper and refactor read policy store Anand Jain
@ 2024-12-06 16:47   ` David Sterba
  0 siblings, 0 replies; 29+ messages in thread
From: David Sterba @ 2024-12-06 16:47 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On Fri, Nov 15, 2024 at 10:54:03PM +0800, Anand Jain wrote:
> Introduce the `btrfs_read_policy_to_enum` helper function to simplify the
> conversion of a string read policy to its corresponding enum value. This
> reduces duplication and improves code clarity in `btrfs_read_policy_store`.
> The `btrfs_read_policy_store` function has been refactored to use the new
> helper.
> 
> The parameter is copied locally to allow modification, enabling the
> separation of the method and its value. This prepares for the addition of
> more functionality in subsequent patches.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/sysfs.c | 50 ++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 38 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index fd3c49c6c3c5..7506818ec45f 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -1307,6 +1307,34 @@ BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
>  
>  static const char * const btrfs_read_policy_name[] = { "pid" };
>  
> +static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str)
> +{
> +	bool found = false;
> +	enum btrfs_read_policy index;
> +	char *param;
> +
> +	if (!str || !strlen(str))

For ineger values please use '== 0'

> +		return 0;
> +
> +	param = kstrdup(str, GFP_KERNEL);

Why do you need the dynamic allocation? The policy strings won't be
long, so a 32 bytes on stack can be sufficient.

> +	if (!param)
> +		return -ENOMEM;
> +
> +	for (index = 0; index < BTRFS_NR_READ_POLICY; index++) {
> +		if (sysfs_streq(param, btrfs_read_policy_name[index])) {
> +			found = true;
> +			break;
> +		}
> +	}
> +
> +	kfree(param);
> +
> +	if (found)
> +		return index;
> +
> +	return -EINVAL;
> +}

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 04/10] btrfs: handle value associated with raid1 balancing parameter
  2024-11-15 14:54 ` [PATCH v3 04/10] btrfs: handle value associated with raid1 balancing parameter Anand Jain
@ 2024-12-06 16:53   ` David Sterba
  2024-12-16  9:38     ` Anand Jain
  0 siblings, 1 reply; 29+ messages in thread
From: David Sterba @ 2024-12-06 16:53 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On Fri, Nov 15, 2024 at 10:54:04PM +0800, Anand Jain wrote:
> This change enables specifying additional configuration values alongside
> the raid1 balancing / read policy in a single input string.
> 
> Updated btrfs_read_policy_to_enum() to parse and handle a value associated
> with the policy in the format `policy:value`, the value part if present is
> converted 64-bit integer. Update btrfs_read_policy_store() to accommodate
> the new parameter.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/sysfs.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index 7506818ec45f..7907507b8ced 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -1307,8 +1307,11 @@ BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
>  
>  static const char * const btrfs_read_policy_name[] = { "pid" };
>  
> -static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str)
> +static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value)

Why is value signed type?

>  {
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +	char *value_str;
> +#endif

Please keep the inline #ifdefs to minimum, with __maybe_unused there
won't be any warning if experimental build is disabled.

>  	bool found = false;
>  	enum btrfs_read_policy index;
>  	char *param;
> @@ -1320,6 +1323,18 @@ static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str)
>  	if (!param)
>  		return -ENOMEM;
>  
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +	/* Separate value from input in policy:value format. */
> +	if ((value_str = strchr(param, ':'))) {
> +		*value_str = '\0';
> +		value_str++;
> +		if (value && kstrtou64(value_str, 10, value) != 0) {

You can assume that the 'value' is always non zero and assert it at the
beginning of the function.

> +			kfree(param);
> +			return -EINVAL;

A negative errno in a function returning enum looks quite strange,
either add an enum type for invalid value or use 'int'.

> +		}
> +	}
> +#endif
> +
>  	for (index = 0; index < BTRFS_NR_READ_POLICY; index++) {
>  		if (sysfs_streq(param, btrfs_read_policy_name[index])) {
>  			found = true;
> @@ -1367,8 +1382,9 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
>  {
>  	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
>  	enum btrfs_read_policy index;
> +	s64 value = -1;
>  
> -	index = btrfs_read_policy_to_enum(buf);
> +	index = btrfs_read_policy_to_enum(buf, &value);
>  	if (index == -EINVAL)
>  		return -EINVAL;
>  
> -- 
> 2.46.1
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing
  2024-11-15 14:54 ` [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing Anand Jain
  2024-11-18  7:34   ` kernel test robot
@ 2024-12-06 17:13   ` David Sterba
  2024-12-16  9:34     ` Anand Jain
  1 sibling, 1 reply; 29+ messages in thread
From: David Sterba @ 2024-12-06 17:13 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On Fri, Nov 15, 2024 at 10:54:05PM +0800, Anand Jain wrote:
> This feature balances I/O across the striped devices when reading from
> RAID1 blocks.
> 
>    echo round-robin:[min_contiguous_read] > /sys/fs/btrfs/<uuid>/read_policy

The ":" is part of the optional value, so the example should be
"round-robin[:min]"

> The min_contiguous_read parameter defines the minimum read size before
> switching to the next mirrored device. This setting is optional, with a
> default value of 256 KiB.
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/sysfs.c   | 38 +++++++++++++++++++++++++++++
>  fs/btrfs/volumes.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++
>  fs/btrfs/volumes.h |  9 +++++++
>  3 files changed, 107 insertions(+)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index 7907507b8ced..092a78298d1a 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -1305,7 +1305,11 @@ static ssize_t btrfs_temp_fsid_show(struct kobject *kobj,
>  }
>  BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
>  
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +static const char * const btrfs_read_policy_name[] = { "pid", "round-robin" };
> +#else
>  static const char * const btrfs_read_policy_name[] = { "pid" };
> +#endif

Instead of the duplication of the definition please put the #ifdef
around the experimental strings.

>  
>  static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value)
>  {
> @@ -1367,6 +1371,12 @@ static ssize_t btrfs_read_policy_show(struct kobject *kobj,
>  
>  		ret += sysfs_emit_at(buf, ret, "%s", btrfs_read_policy_name[i]);
>  
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +		if (i == BTRFS_READ_POLICY_RR)
> +			ret += sysfs_emit_at(buf, ret, ":%d",
> +					     fs_devices->min_contiguous_read);

Maybe this should be printed with the suffix, "rr:256k".

> +#endif
> +
>  		if (i == policy)
>  			ret += sysfs_emit_at(buf, ret, "]");
>  	}
> @@ -1388,6 +1398,34 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
>  	if (index == -EINVAL)
>  		return -EINVAL;
>  
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +	if (index == BTRFS_READ_POLICY_RR) {
> +		if (value != -1) {
> +			if ((value % fs_devices->fs_info->sectorsize) != 0) {

I think 64bit types and % could lead to the division which fails on
32bit platforms. As it's the s64 type for no apparent reason a u32 would
be a better fit.

> +				btrfs_err(fs_devices->fs_info,
> +"read_policy: min_contiguous_read %lld should be multiples of the sectorsize %u",
> +					  value, fs_devices->fs_info->sectorsize);
> +				return -EINVAL;

This does not need to fail hard, we can simply round it up to the next
sector size multiple.

Also this does not validate the upper bound of value.

> +			}
> +		} else {
> +			/* value is not provided, set it to the default 256k */
> +			value = 256 * 1024;

Please use a named constant, then the comment is not necessary.

> +		}
> +
> +		if (index != READ_ONCE(fs_devices->read_policy) ||
> +		    value != READ_ONCE(fs_devices->min_contiguous_read)) {
> +			WRITE_ONCE(fs_devices->read_policy, index);
> +			WRITE_ONCE(fs_devices->min_contiguous_read, value);
> +			atomic_set(&fs_devices->total_reads, 0);
> +
> +			btrfs_info(fs_devices->fs_info, "read policy set to '%s:%lld'",
> +				   btrfs_read_policy_name[index], value);
> +
> +		}
> +
> +		return len;
> +	}
> +#endif
>  	if (index != READ_ONCE(fs_devices->read_policy)) {
>  		WRITE_ONCE(fs_devices->read_policy, index);
>  		btrfs_info(fs_devices->fs_info, "read policy set to '%s'",
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index fe5ceea2ba0b..97576a715191 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1328,6 +1328,10 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
>  	fs_devices->total_rw_bytes = 0;
>  	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
>  	fs_devices->read_policy = BTRFS_READ_POLICY_PID;
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +	/* Set min_contiguous_read to a default 256kib */
> +	fs_devices->min_contiguous_read = 256 * 1024;

Named constant.

> +#endif
>  
>  	return 0;
>  }
> @@ -5959,6 +5963,57 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info,
>  	return len;
>  }
>  
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +struct stripe_mirror {
> +	u64 devid;
> +	int num;
> +};
> +
> +static int btrfs_cmp_devid(const void *a, const void *b)
> +{
> +	struct stripe_mirror *s1 = (struct stripe_mirror *)a;
> +	struct stripe_mirror *s2 = (struct stripe_mirror *)b;

Please keep the const qualifier.

> +
> +	if (s1->devid < s2->devid)
> +		return -1;
> +	if (s1->devid > s2->devid)
> +		return 1;
> +	return 0;
> +}
> +
> +static int btrfs_read_rr(struct btrfs_chunk_map *map, int first, int num_stripe)

Please add a comment explaining what the function does or how it selects
the device.

> +{
> +	struct stripe_mirror stripes[4] = {0}; //4: max possible mirrors

Another magic constant. We probably won't have more than 4 but it still
needs to be defined elsewhre. Also the comment format is wrong.

> +	struct btrfs_fs_devices *fs_devices;
> +	struct btrfs_device *device;
> +	int j;

This is used in the for loop only, so you can declare it as
for (int j = ...)

> +	int read_cycle;
> +	int index;
> +	int ret_stripe;
> +	int total_reads;
> +	int reads_per_dev = 0;
> +
> +	device = map->stripes[first].dev;
> +
> +	fs_devices = device->fs_devices;
> +	reads_per_dev = fs_devices->min_contiguous_read/fs_devices->fs_info->sectorsize;

Missing space around '/' and we have the sectorsize_bits so division can
be replaced by >>

> +	index = 0;
> +	for (j = first; j < first + num_stripe; j++) {

Actually you can use 'i' for iteration.

> +		stripes[index].devid = map->stripes[j].dev->devid;
> +		stripes[index].num = j;
> +		index++;
> +	}
> +	sort(stripes, num_stripe, sizeof(struct stripe_mirror),
> +	     btrfs_cmp_devid, NULL);
> +
> +	total_reads = atomic_inc_return(&fs_devices->total_reads);
> +	read_cycle = total_reads/reads_per_dev;

Missing spaces around "/"

> +	ret_stripe = stripes[read_cycle % num_stripe].num;
> +
> +	return ret_stripe;
> +}
> +#endif
> +
>  static int find_live_mirror(struct btrfs_fs_info *fs_info,
>  			    struct btrfs_chunk_map *map, int first,
>  			    int dev_replace_is_ongoing)
> @@ -5988,6 +6043,11 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
>  	case BTRFS_READ_POLICY_PID:
>  		preferred_mirror = first + (current->pid % num_stripes);
>  		break;
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +	case BTRFS_READ_POLICY_RR:
> +		preferred_mirror = btrfs_read_rr(map, first, num_stripes);
> +		break;
> +#endif
>  	}
>  
>  	if (dev_replace_is_ongoing &&
> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
> index 3a416b1bc24c..05778361c270 100644
> --- a/fs/btrfs/volumes.h
> +++ b/fs/btrfs/volumes.h
> @@ -303,6 +303,10 @@ enum btrfs_chunk_allocation_policy {
>  enum btrfs_read_policy {
>  	/* Use process PID to choose the stripe */
>  	BTRFS_READ_POLICY_PID,
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +	/* Balancing raid1 reads across all striped devices (round-robin) */
> +	BTRFS_READ_POLICY_RR,
> +#endif
>  	BTRFS_NR_READ_POLICY,
>  };
>  
> @@ -431,6 +435,11 @@ struct btrfs_fs_devices {
>  	enum btrfs_read_policy read_policy;
>  
>  #ifdef CONFIG_BTRFS_EXPERIMENTAL
> +	/* IO stat, read counter. */
> +	atomic_t total_reads;
> +	/* Min contiguous reads before switching to next device. */
> +	int min_contiguous_read;

As this is in the fs_devices, there should be some prefix that it's
related to the read policy or round robin.

> +
>  	/* Checksum mode - offload it or do it synchronously. */
>  	enum btrfs_offload_csum_mode offload_csum_mode;
>  #endif
> -- 
> 2.46.1
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status
  2024-11-15 14:54 ` [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status Anand Jain
@ 2024-12-06 17:16   ` David Sterba
  2024-12-16 15:14     ` Anand Jain
  2024-12-06 17:18   ` David Sterba
  1 sibling, 1 reply; 29+ messages in thread
From: David Sterba @ 2024-12-06 17:16 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On Fri, Nov 15, 2024 at 10:54:07PM +0800, Anand Jain wrote:
> Commit c9c49e8f157e ("btrfs: split out CONFIG_BTRFS_EXPERIMENTAL from
> CONFIG_BTRFS_DEBUG") introduces a way to enable or disable experimental
> features, print its status during module load, like so:
> 
> Btrfs loaded, debug=on, assert=on, zoned=yes, fsverity=yes
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/super.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 6cc9291c4552..d52f7f6e2de7 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -2487,6 +2487,11 @@ static int __init btrfs_print_mod_info(void)
>  			", fsverity=yes"
>  #else
>  			", fsverity=no"
> +#endif
> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
> +			", experimental=yes"
> +#else
> +			", experimental=no"

I think this should be the first in the list so it's obvious. Also I'm
not sure if we need to print it when it's disabled.

>  #endif
>  			;
>  	pr_info("Btrfs loaded%s\n", options);
> -- 
> 2.46.1
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status
  2024-11-15 14:54 ` [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status Anand Jain
  2024-12-06 17:16   ` David Sterba
@ 2024-12-06 17:18   ` David Sterba
  1 sibling, 0 replies; 29+ messages in thread
From: David Sterba @ 2024-12-06 17:18 UTC (permalink / raw)
  To: Anand Jain; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On Fri, Nov 15, 2024 at 10:54:07PM +0800, Anand Jain wrote:
> Commit c9c49e8f157e ("btrfs: split out CONFIG_BTRFS_EXPERIMENTAL from
> CONFIG_BTRFS_DEBUG") introduces a way to enable or disable experimental
> features, print its status during module load, like so:
> 
> Btrfs loaded, debug=on, assert=on, zoned=yes, fsverity=yes
> 
> Signed-off-by: Anand Jain <anand.jain@oracle.com>

Oh and please change the subject so it's actually understandable, 'pr'
is not a verb.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing
  2024-12-06 17:13   ` David Sterba
@ 2024-12-16  9:34     ` Anand Jain
  0 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-12-16  9:34 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On 6/12/24 22:43, David Sterba wrote:
> On Fri, Nov 15, 2024 at 10:54:05PM +0800, Anand Jain wrote:
>> This feature balances I/O across the striped devices when reading from
>> RAID1 blocks.
>>
>>     echo round-robin:[min_contiguous_read] > /sys/fs/btrfs/<uuid>/read_policy
> 
> The ":" is part of the optional value, so the example should be
> "round-robin[:min]"
> 

fixed.

>> The min_contiguous_read parameter defines the minimum read size before
>> switching to the next mirrored device. This setting is optional, with a
>> default value of 256 KiB.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   fs/btrfs/sysfs.c   | 38 +++++++++++++++++++++++++++++
>>   fs/btrfs/volumes.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++
>>   fs/btrfs/volumes.h |  9 +++++++
>>   3 files changed, 107 insertions(+)
>>
>> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
>> index 7907507b8ced..092a78298d1a 100644
>> --- a/fs/btrfs/sysfs.c
>> +++ b/fs/btrfs/sysfs.c
>> @@ -1305,7 +1305,11 @@ static ssize_t btrfs_temp_fsid_show(struct kobject *kobj,
>>   }
>>   BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
>>   
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +static const char * const btrfs_read_policy_name[] = { "pid", "round-robin" };
>> +#else
>>   static const char * const btrfs_read_policy_name[] = { "pid" };
>> +#endif
> 
> Instead of the duplication of the definition please put the #ifdef
> around the experimental strings.
> 

Now fixed.

>>   
>>   static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value)
>>   {
>> @@ -1367,6 +1371,12 @@ static ssize_t btrfs_read_policy_show(struct kobject *kobj,
>>   
>>   		ret += sysfs_emit_at(buf, ret, "%s", btrfs_read_policy_name[i]);
>>   
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +		if (i == BTRFS_READ_POLICY_RR)
>> +			ret += sysfs_emit_at(buf, ret, ":%d",
>> +					     fs_devices->min_contiguous_read);
> 
> Maybe this should be printed with the suffix, "rr:256k".
> 

If the output has a suffix, we should also accept an input suffix.
Perhaps I could do that separately after this patch set?

>> +#endif
>> +
>>   		if (i == policy)
>>   			ret += sysfs_emit_at(buf, ret, "]");
>>   	}
>> @@ -1388,6 +1398,34 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
>>   	if (index == -EINVAL)
>>   		return -EINVAL;
>>   
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +	if (index == BTRFS_READ_POLICY_RR) {
>> +		if (value != -1) {
>> +			if ((value % fs_devices->fs_info->sectorsize) != 0) {
> 
> I think 64bit types and % could lead to the division which fails on
> 32bit platforms. As it's the s64 type for no apparent reason a u32 would
> be a better fit.
> 

Yeah, that will be a problem. In v4, I'm using IS_ALIGNED(), and we need
the value in the argument as a u64 because it also represents a devid.


>> +				btrfs_err(fs_devices->fs_info,
>> +"read_policy: min_contiguous_read %lld should be multiples of the sectorsize %u",
>> +					  value, fs_devices->fs_info->sectorsize);
>> +				return -EINVAL;
> 
> This does not need to fail hard, we can simply round it up to the next
> sector size multiple.
> 

Oh yeah that will be nice. Now fixed.

> Also this does not validate the upper bound of value.
> 

Um, IMO, there's no need to validate the upper bound as long as
it is sector-size aligned, because it always upto the use
case to decide. And, there is no theoretical limit we could set
in the code.

>> +			}
>> +		} else {
>> +			/* value is not provided, set it to the default 256k */
>> +			value = 256 * 1024;
> 
> Please use a named constant, then the comment is not necessary.
> 

Done in v4.

>> +		}
>> +
>> +		if (index != READ_ONCE(fs_devices->read_policy) ||
>> +		    value != READ_ONCE(fs_devices->min_contiguous_read)) {
>> +			WRITE_ONCE(fs_devices->read_policy, index);
>> +			WRITE_ONCE(fs_devices->min_contiguous_read, value);
>> +			atomic_set(&fs_devices->total_reads, 0);
>> +
>> +			btrfs_info(fs_devices->fs_info, "read policy set to '%s:%lld'",
>> +				   btrfs_read_policy_name[index], value);
>> +
>> +		}
>> +
>> +		return len;
>> +	}
>> +#endif
>>   	if (index != READ_ONCE(fs_devices->read_policy)) {
>>   		WRITE_ONCE(fs_devices->read_policy, index);
>>   		btrfs_info(fs_devices->fs_info, "read policy set to '%s'",
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index fe5ceea2ba0b..97576a715191 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1328,6 +1328,10 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
>>   	fs_devices->total_rw_bytes = 0;
>>   	fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
>>   	fs_devices->read_policy = BTRFS_READ_POLICY_PID;
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +	/* Set min_contiguous_read to a default 256kib */
>> +	fs_devices->min_contiguous_read = 256 * 1024;
> 
> Named constant.
> 

Fixed.

>> +#endif
>>   
>>   	return 0;
>>   }
>> @@ -5959,6 +5963,57 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info,
>>   	return len;
>>   }
>>   
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +struct stripe_mirror {
>> +	u64 devid;
>> +	int num;
>> +};
>> +
>> +static int btrfs_cmp_devid(const void *a, const void *b)
>> +{
>> +	struct stripe_mirror *s1 = (struct stripe_mirror *)a;
>> +	struct stripe_mirror *s2 = (struct stripe_mirror *)b;
> 
> Please keep the const qualifier.
> 

Fixed.

>> +
>> +	if (s1->devid < s2->devid)
>> +		return -1;
>> +	if (s1->devid > s2->devid)
>> +		return 1;
>> +	return 0;
>> +}
>> +
>> +static int btrfs_read_rr(struct btrfs_chunk_map *map, int first, int num_stripe)
> 
> Please add a comment explaining what the function does or how it selects
> the device.
> 

Added.

>> +{
>> +	struct stripe_mirror stripes[4] = {0}; //4: max possible mirrors
> 
> Another magic constant. We probably won't have more than 4 but it still
> needs to be defined elsewhre. Also the comment format is wrong.
> 

Fixed in v4 (in volume.h).


>> +	struct btrfs_fs_devices *fs_devices;
>> +	struct btrfs_device *device;
>> +	int j;
> 
> This is used in the for loop only, so you can declare it as
> for (int j = ...)
> 

Fixed.

>> +	int read_cycle;
>> +	int index;
>> +	int ret_stripe;
>> +	int total_reads;
>> +	int reads_per_dev = 0;
>> +
>> +	device = map->stripes[first].dev;
>> +
>> +	fs_devices = device->fs_devices;
>> +	reads_per_dev = fs_devices->min_contiguous_read/fs_devices->fs_info->sectorsize;
> 
> Missing space around '/' and we have the sectorsize_bits so division can
> be replaced by >>
> 

Fixed.

>> +	index = 0;
>> +	for (j = first; j < first + num_stripe; j++) {
> 
> Actually you can use 'i' for iteration.

Fixed.

> 
>> +		stripes[index].devid = map->stripes[j].dev->devid;
>> +		stripes[index].num = j;
>> +		index++;
>> +	}
>> +	sort(stripes, num_stripe, sizeof(struct stripe_mirror),
>> +	     btrfs_cmp_devid, NULL);
>> +
>> +	total_reads = atomic_inc_return(&fs_devices->total_reads);
>> +	read_cycle = total_reads/reads_per_dev;
> 
> Missing spaces around "/"
> 

Fixed.


>> +	ret_stripe = stripes[read_cycle % num_stripe].num;
>> +
>> +	return ret_stripe;
>> +}
>> +#endif
>> +
>>   static int find_live_mirror(struct btrfs_fs_info *fs_info,
>>   			    struct btrfs_chunk_map *map, int first,
>>   			    int dev_replace_is_ongoing)
>> @@ -5988,6 +6043,11 @@ static int find_live_mirror(struct btrfs_fs_info *fs_info,
>>   	case BTRFS_READ_POLICY_PID:
>>   		preferred_mirror = first + (current->pid % num_stripes);
>>   		break;
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +	case BTRFS_READ_POLICY_RR:
>> +		preferred_mirror = btrfs_read_rr(map, first, num_stripes);
>> +		break;
>> +#endif
>>   	}
>>   
>>   	if (dev_replace_is_ongoing &&
>> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
>> index 3a416b1bc24c..05778361c270 100644
>> --- a/fs/btrfs/volumes.h
>> +++ b/fs/btrfs/volumes.h
>> @@ -303,6 +303,10 @@ enum btrfs_chunk_allocation_policy {
>>   enum btrfs_read_policy {
>>   	/* Use process PID to choose the stripe */
>>   	BTRFS_READ_POLICY_PID,
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +	/* Balancing raid1 reads across all striped devices (round-robin) */
>> +	BTRFS_READ_POLICY_RR,
>> +#endif
>>   	BTRFS_NR_READ_POLICY,
>>   };
>>   
>> @@ -431,6 +435,11 @@ struct btrfs_fs_devices {
>>   	enum btrfs_read_policy read_policy;
>>   
>>   #ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +	/* IO stat, read counter. */
>> +	atomic_t total_reads;
>> +	/* Min contiguous reads before switching to next device. */
>> +	int min_contiguous_read;
> 
> As this is in the fs_devices, there should be some prefix that it's
> related to the read policy or round robin.
> 

Naming is bit challenging given that it is long already,
I have renamed to rr_min_contiguous_read.


Thanks, Anand

>> +
>>   	/* Checksum mode - offload it or do it synchronously. */
>>   	enum btrfs_offload_csum_mode offload_csum_mode;
>>   #endif
>> -- 
>> 2.46.1
>>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 04/10] btrfs: handle value associated with raid1 balancing parameter
  2024-12-06 16:53   ` David Sterba
@ 2024-12-16  9:38     ` Anand Jain
  0 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-12-16  9:38 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On 6/12/24 22:23, David Sterba wrote:
> On Fri, Nov 15, 2024 at 10:54:04PM +0800, Anand Jain wrote:
>> This change enables specifying additional configuration values alongside
>> the raid1 balancing / read policy in a single input string.
>>
>> Updated btrfs_read_policy_to_enum() to parse and handle a value associated
>> with the policy in the format `policy:value`, the value part if present is
>> converted 64-bit integer. Update btrfs_read_policy_store() to accommodate
>> the new parameter.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   fs/btrfs/sysfs.c | 20 ++++++++++++++++++--
>>   1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
>> index 7506818ec45f..7907507b8ced 100644
>> --- a/fs/btrfs/sysfs.c
>> +++ b/fs/btrfs/sysfs.c
>> @@ -1307,8 +1307,11 @@ BTRFS_ATTR(, temp_fsid, btrfs_temp_fsid_show);
>>   
>>   static const char * const btrfs_read_policy_name[] = { "pid" };
>>   
>> -static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str)
>> +static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str, s64 *value)
> 
> Why is value signed type?
> 

  The signed type is intended for future enhancements..
  For the "Device" raid1-balancing method, we need to specify either
  the actual device ID (as of now) or a negative number (in future)
  representing a device with lowest generation number. Perhaps -1?
  I'm ok to make it u64, if there is a better way.


>>   {
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +	char *value_str;
>> +#endif
> 
> Please keep the inline #ifdefs to minimum, with __maybe_unused there
> won't be any warning if experimental build is disabled.
> 

  Oh, right. Will fix.

>>   	bool found = false;
>>   	enum btrfs_read_policy index;
>>   	char *param;
>> @@ -1320,6 +1323,18 @@ static enum btrfs_read_policy btrfs_read_policy_to_enum(const char *str)
>>   	if (!param)
>>   		return -ENOMEM;
>>   
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +	/* Separate value from input in policy:value format. */
>> +	if ((value_str = strchr(param, ':'))) {
>> +		*value_str = '\0';
>> +		value_str++;
>> +		if (value && kstrtou64(value_str, 10, value) != 0) {
> 
> You can assume that the 'value' is always non zero and assert it at the
> beginning of the function.
> 

  In v3 I had to check because patch 8/10 sends out arg with NULL.
  In v4 I have revised with a common helper in which the assert
  will help.

>> +			kfree(param);
>> +			return -EINVAL;
> 
> A negative errno in a function returning enum looks quite strange,
> either add an enum type for invalid value or use 'int'.
> 

now declared it is int. It was simpler to implement.

Thanks.

>> +		}
>> +	}
>> +#endif
>> +
>>   	for (index = 0; index < BTRFS_NR_READ_POLICY; index++) {
>>   		if (sysfs_streq(param, btrfs_read_policy_name[index])) {
>>   			found = true;
>> @@ -1367,8 +1382,9 @@ static ssize_t btrfs_read_policy_store(struct kobject *kobj,
>>   {
>>   	struct btrfs_fs_devices *fs_devices = to_fs_devs(kobj);
>>   	enum btrfs_read_policy index;
>> +	s64 value = -1;
>>   
>> -	index = btrfs_read_policy_to_enum(buf);
>> +	index = btrfs_read_policy_to_enum(buf, &value);
>>   	if (index == -EINVAL)
>>   		return -EINVAL;
>>   
>> -- 
>> 2.46.1
>>


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status
  2024-12-06 17:16   ` David Sterba
@ 2024-12-16 15:14     ` Anand Jain
  0 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2024-12-16 15:14 UTC (permalink / raw)
  To: dsterba; +Cc: linux-btrfs, dsterba, wqu, hrx, waxhead

On 6/12/24 22:46, David Sterba wrote:
> On Fri, Nov 15, 2024 at 10:54:07PM +0800, Anand Jain wrote:
>> Commit c9c49e8f157e ("btrfs: split out CONFIG_BTRFS_EXPERIMENTAL from
>> CONFIG_BTRFS_DEBUG") introduces a way to enable or disable experimental
>> features, print its status during module load, like so:
>>
>> Btrfs loaded, debug=on, assert=on, zoned=yes, fsverity=yes
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   fs/btrfs/super.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>> index 6cc9291c4552..d52f7f6e2de7 100644
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -2487,6 +2487,11 @@ static int __init btrfs_print_mod_info(void)
>>   			", fsverity=yes"
>>   #else
>>   			", fsverity=no"
>> +#endif
>> +#ifdef CONFIG_BTRFS_EXPERIMENTAL
>> +			", experimental=yes"
>> +#else
>> +			", experimental=no"
> 
> I think this should be the first in the list so it's obvious.


> Also I'm
> not sure if we need to print it when it's disabled.

It should be fine. It matches with debug and asset; they aren't
printed when disabled.

So, currently:
    Btrfs loaded, debug=on, assert=on, zoned=yes, fsverity=yes, 
experimental=yes

Fixed to:
    Btrfs loaded, experimental=on, debug=on, assert=on, zoned=yes, 
fsverity=yes

And when disabled.
    Btrfs loaded, debug=on, assert=on, zoned=yes, fsverity=yes

Thx.


>>   #endif
>>   			;
>>   	pr_info("Btrfs loaded%s\n", options);
>> -- 
>> 2.46.1
>>


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2024-12-16 15:14 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-15 14:54 [PATCH v3 00/10] raid1 balancing methods Anand Jain
2024-11-15 14:54 ` [PATCH v3 01/10] btrfs: initialize fs_devices->fs_info earlier Anand Jain
2024-11-15 14:54 ` [PATCH v3 02/10] btrfs: simplify output formatting in btrfs_read_policy_show Anand Jain
2024-11-15 14:54 ` [PATCH v3 03/10] btrfs: add btrfs_read_policy_to_enum helper and refactor read policy store Anand Jain
2024-12-06 16:47   ` David Sterba
2024-11-15 14:54 ` [PATCH v3 04/10] btrfs: handle value associated with raid1 balancing parameter Anand Jain
2024-12-06 16:53   ` David Sterba
2024-12-16  9:38     ` Anand Jain
2024-11-15 14:54 ` [PATCH v3 05/10] btrfs: introduce RAID1 round-robin read balancing Anand Jain
2024-11-18  7:34   ` kernel test robot
2024-12-06 17:13   ` David Sterba
2024-12-16  9:34     ` Anand Jain
2024-11-15 14:54 ` [PATCH v3 06/10] btrfs: add RAID1 preferred read device Anand Jain
2024-11-15 14:54 ` [PATCH v3 07/10] btrfs: pr CONFIG_BTRFS_EXPERIMENTAL status Anand Jain
2024-12-06 17:16   ` David Sterba
2024-12-16 15:14     ` Anand Jain
2024-12-06 17:18   ` David Sterba
2024-11-15 14:54 ` [PATCH v3 08/10] btrfs: fix CONFIG_BTRFS_EXPERIMENTAL migration Anand Jain
2024-11-15 14:54 ` [PATCH v3 09/10] btrfs: enable RAID1 balancing configuration via modprobe parameter Anand Jain
2024-11-15 14:54 ` [PATCH v3 10/10] btrfs: modload to print RAID1 balancing status Anand Jain
2024-11-15 15:20 ` [PATCH 0/2] fstests: refine filesystem module reload handling Anand Jain
2024-11-15 15:20   ` [PATCH 1/2] fstests: move fs-module reload to earlier in the run_section function Anand Jain
2024-11-15 16:39     ` Darrick J. Wong
2024-11-18  0:12       ` Anand Jain
2024-11-15 16:48     ` Filipe Manana
2024-11-15 15:20   ` [PATCH 2/2] fstests: FS_MODULE_RELOAD_OPTIONS to control filesystem module reload options Anand Jain
2024-11-15 16:34     ` Darrick J. Wong
2024-11-15 19:16 ` [PATCH v3 00/10] raid1 balancing methods Filipe Manana
2024-11-18  1:15   ` Anand Jain

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox