* [PATCH 09/10] mdadm/tests: remove 09imsm-assemble.broken
2024-08-28 2:11 Xiao Ni
@ 2024-08-28 2:11 ` Xiao Ni
0 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-08-28 2:11 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: ncroxon, linux-raid
09imsm-assemble can run successfully.
Signed-off-by: Xiao Ni <xni@redhat.com>
---
tests/09imsm-assemble.broken | 6 ------
1 file changed, 6 deletions(-)
delete mode 100644 tests/09imsm-assemble.broken
diff --git a/tests/09imsm-assemble.broken b/tests/09imsm-assemble.broken
index a6d4d5cf911b..000000000000
--- a/tests/09imsm-assemble.broken
+++ /dev/null
@@ -1,6 +0,0 @@
-fails infrequently
-
-Fails roughly 1 in 10 runs with errors:
-
- mdadm: /dev/loop2 is still in use, cannot remove.
- /dev/loop2 removal from /dev/md/container should have succeeded
--
2.32.0 (Apple Git-132)
* [PATCH 00/10] mdadm tests fix
@ 2024-09-11 8:54 Xiao Ni
2024-09-11 8:54 ` [PATCH V2 1/1] mdadm/Grow: Update new level when starting reshape Xiao Ni
` (10 more replies)
0 siblings, 11 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
This is the fourth patch set that enhances and fixes the mdadm regression tests.
v2: fix problems in the first patches
Xiao Ni (10):
mdadm/Grow: Update new level when starting reshape
mdadm/Grow: Update reshape_progress to need_back after reshape
finishes
mdadm/Grow: Can't open raid when running --grow --continue
mdadm/Grow: sleep a while after removing disk in impose_level
mdadm/tests: wait until level changes
mdadm/tests: 07changelevels fix
mdadm/tests: Remove 07reshape5intr.broken
mdadm/tests: 07testreshape5 fix
mdadm/tests: remove 09imsm-assemble.broken
mdadm/Manage: record errno
Grow.c | 39 +++++++++++++++++++++++++------
Manage.c | 8 ++++---
dev/null | 0
tests/05r6tor0 | 4 ++++
tests/07changelevels | 27 ++++++++++------------
tests/07changelevels.broken | 9 --------
tests/07reshape5intr.broken | 45 ------------------------------------
tests/07testreshape5 | 1 +
tests/07testreshape5.broken | 12 ----------
tests/09imsm-assemble.broken | 6 -----
tests/func.sh | 4 ++++
11 files changed, 58 insertions(+), 97 deletions(-)
create mode 100644 dev/null
delete mode 100644 tests/07changelevels.broken
delete mode 100644 tests/07reshape5intr.broken
delete mode 100644 tests/07testreshape5.broken
delete mode 100644 tests/09imsm-assemble.broken
--
2.32.0 (Apple Git-132)
* [PATCH V2 1/1] mdadm/Grow: Update new level when starting reshape
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-25 7:51 ` Mariusz Tkaczyk
2024-09-11 8:54 ` [PATCH V2 2/2] mdadm/Grow: Update reshape_progress to need_back after reshape finishes Xiao Ni
` (9 subsequent siblings)
10 siblings, 1 reply; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
A reshape needs a backup file when it cannot update the data offset of
the member disks. In that case mdadm first starts the reshape and then
kicks off the mdadm-grow-continue service, which does the backup job and
monitors the reshape progress. The service is a separate process, so it
has to read the superblocks of the member disks to get its information.
But the first step does not update the new level in the superblock, so
the level change after the reshape finishes fails because the recorded
new level is wrong. Record the new level in the first step as well.
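For illustration only, a rough sketch of the flow described above. The
device /dev/md0 and the instantiated unit name (derived from mdadm's
mdadm-grow-continue@.service template) are assumptions for the example,
not taken from this patch:
  # start a reshape that needs a backup file; mdadm hands the backup and
  # monitoring job over to the separate mdadm-grow-continue service
  mdadm --grow /dev/md0 -l6 --backup-file /tmp/md-backup
  systemctl status mdadm-grow-continue@md0.service
  # the service re-reads the member superblocks, so the target level must
  # already be recorded when the reshape starts (new_level needs >= 6.12)
  cat /sys/block/md0/md/level /sys/block/md0/md/new_level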
Signed-off-by: Xiao Ni <xni@redhat.com>
---
v2: format change, add get_linux_version
Grow.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/Grow.c b/Grow.c
index 5810b128aa99..533f301468af 100644
--- a/Grow.c
+++ b/Grow.c
@@ -2941,15 +2941,24 @@ static int impose_reshape(struct mdinfo *sra,
* persists from some earlier problem.
*/
int err = 0;
+
if (sysfs_set_num(sra, NULL, "chunk_size", info->new_chunk) < 0)
err = errno;
+
if (!err && sysfs_set_num(sra, NULL, "layout",
reshape->after.layout) < 0)
err = errno;
+
+ /* new_level is introduced in kernel 6.12 */
+ if (!err && get_linux_version() >= 6012000 &&
+ sysfs_set_num(sra, NULL, "new_level", info->new_level) < 0)
+ err = errno;
+
if (!err && subarray_set_num(container, sra, "raid_disks",
reshape->after.data_disks +
reshape->parity) < 0)
err = errno;
+
if (err) {
pr_err("Cannot set device shape for %s\n", devname);
--
2.32.0 (Apple Git-132)
* [PATCH V2 2/2] mdadm/Grow: Update reshape_progress to need_back after reshape finishes
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
2024-09-11 8:54 ` [PATCH V2 1/1] mdadm/Grow: Update new level when starting reshape Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-11 8:54 ` [PATCH 03/10] mdadm/Grow: Can't open raid when running --grow --continue Xiao Ni
` (8 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
mdadm tries to update the data offset when it kicks off a reshape. If it
cannot change the data offset, it has to use child_monitor to monitor the
reshape progress and do the backup job, and it has to update
reshape_progress to need_backup when the reshape finishes. Otherwise it
ends up in an infinite loop.
Signed-off-by: Xiao Ni <xni@redhat.com>
---
v2: add empty line after declaration
Grow.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/Grow.c b/Grow.c
index 533f301468af..3b9f994200aa 100644
--- a/Grow.c
+++ b/Grow.c
@@ -4148,8 +4148,8 @@ int progress_reshape(struct mdinfo *info, struct reshape *reshape,
* waiting forever on a dead array
*/
char action[SYSFS_MAX_BUF_SIZE];
- if (sysfs_get_str(info, NULL, "sync_action", action, sizeof(action)) <= 0 ||
- strncmp(action, "reshape", 7) != 0)
+
+ if (sysfs_get_str(info, NULL, "sync_action", action, sizeof(action)) <= 0)
break;
/* Some kernels reset 'sync_completed' to zero
* before setting 'sync_action' to 'idle'.
@@ -4157,12 +4157,18 @@ int progress_reshape(struct mdinfo *info, struct reshape *reshape,
*/
if (completed == 0 && advancing &&
strncmp(action, "idle", 4) == 0 &&
- info->reshape_progress > 0)
+ info->reshape_progress > 0) {
+ info->reshape_progress = need_backup;
break;
+ }
if (completed == 0 && !advancing &&
strncmp(action, "idle", 4) == 0 &&
info->reshape_progress <
- (info->component_size * reshape->after.data_disks))
+ (info->component_size * reshape->after.data_disks)) {
+ info->reshape_progress = need_backup;
+ break;
+ }
+ if (strncmp(action, "reshape", 7) != 0)
break;
sysfs_wait(fd, NULL);
if (sysfs_fd_get_ll(fd, &completed) < 0)
--
2.32.0 (Apple Git-132)
* [PATCH 03/10] mdadm/Grow: Can't open raid when running --grow --continue
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
2024-09-11 8:54 ` [PATCH V2 1/1] mdadm/Grow: Update new level when starting reshape Xiao Ni
2024-09-11 8:54 ` [PATCH V2 2/2] mdadm/Grow: Update reshape_progress to need_back after reshape finishes Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-11 8:54 ` [PATCH v2 4/4] mdadm/Grow: sleep a while after removing disk in impose_level Xiao Ni
` (7 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
Grow_continue is passed 'array' as devname, so opening the raid device
by that name fails. Open the raid device through the mdinfo sys_name
instead.
Signed-off-by: Xiao Ni <xni@redhat.com>
---
Grow.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/Grow.c b/Grow.c
index 6b621aea4ecc..2a7587315817 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3688,9 +3688,12 @@ started:
set_array_size(st, info, info->text_version);
if (info->new_level != reshape.level) {
- if (fd < 0)
- fd = open(devname, O_RDONLY);
- impose_level(fd, info->new_level, devname, verbose);
+ fd = open_dev(sra->sys_name);
+ if (fd < 0) {
+ pr_err("Can't open %s\n", sra->sys_name);
+ goto out;
+ }
+ impose_level(fd, info->new_level, sra->sys_name, verbose);
close(fd);
if (info->new_level == 0)
st->update_tail = NULL;
--
2.32.0 (Apple Git-132)
* [PATCH v2 4/4] mdadm/Grow: sleep a while after removing disk in impose_level
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
` (2 preceding siblings ...)
2024-09-11 8:54 ` [PATCH 03/10] mdadm/Grow: Can't open raid when running --grow --continue Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-11 8:54 ` [PATCH 05/10] mdadm/tests: wait until level changes Xiao Ni
` (6 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
Reshaping from raid456 to raid0 needs to remove disks. Removing a disk
makes the kernel set MD_RECOVERY_RUNNING, and the level change fails
while that flag is set. So wait a while to let the md thread clear the
flag.
This was found by test case 05r6tor0.
Signed-off-by: Xiao Ni <xni@redhat.com>
---
v2: add log to give friendly message
Grow.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/Grow.c b/Grow.c
index ebb53a0dfe9c..60076f56054c 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3034,6 +3034,13 @@ static int impose_level(int fd, int level, char *devname, int verbose)
makedev(disk.major, disk.minor));
hot_remove_disk(fd, makedev(disk.major, disk.minor), 1);
}
+ /*
+ * hot_remove_disk lets kernel set MD_RECOVERY_RUNNING
+ * and it can't set level. It needs to wait sometime
+ * to let md thread to clear the flag.
+ */
+ pr_info("wait 5 seconds to give kernel space to finish job\n");
+ sleep_for(5, 0, true);
}
c = map_num(pers, level);
if (c) {
--
2.32.0 (Apple Git-132)
* [PATCH 05/10] mdadm/tests: wait until level changes
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
` (3 preceding siblings ...)
2024-09-11 8:54 ` [PATCH v2 4/4] mdadm/Grow: sleep a while after removing disk in impose_level Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-11 8:54 ` [PATCH 06/10] mdadm/tests: 07changelevels fix Xiao Ni
` (5 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
'check wait' waits until the reshape finishes, but it does not wait for
the level change. The level change happens in a forked child process, so
the test needs to find that child process and wait for it to exit.
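As an alternative sketch only (not what this patch implements), the test
could instead poll the md level attribute until it reports the target
level; the device name md0 and the target raid0 below are just examples:
  want=raid0
  while [ "$(cat /sys/block/md0/md/level 2>/dev/null)" != "$want" ]
  do
          sleep 1
  done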
Signed-off-by: Xiao Ni <xni@redhat.com>
---
tests/05r6tor0 | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tests/05r6tor0 b/tests/05r6tor0
index 2fd51f2ea4bb..b2685b721c2e 100644
--- a/tests/05r6tor0
+++ b/tests/05r6tor0
@@ -13,6 +13,10 @@ check raid5
testdev $md0 3 19456 512
mdadm -G $md0 -l0
check wait; sleep 1
+while ps auxf | grep "mdadm -G" | grep -v grep
+do
+ sleep 1
+done
check raid0
testdev $md0 3 19456 512
mdadm -G $md0 -l5 --add $dev3 $dev4
--
2.32.0 (Apple Git-132)
* [PATCH 06/10] mdadm/tests: 07changelevels fix
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
` (4 preceding siblings ...)
2024-09-11 8:54 ` [PATCH 05/10] mdadm/tests: wait until level changes Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-11 8:54 ` [PATCH 07/10] mdadm/tests: Remove 07reshape5intr.broken Xiao Ni
` (4 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
There are five changes to this case.
1. Remove the testdev check. It cannot work any more; check directly
whether the device is a block device.
2. The level and the chunk size cannot be changed at the same time.
3. Sleep for more than 10s before 'check wait' (a sketch of this
throttling pattern follows the list).
The test devices are small, so a reshape can finish almost as soon as it
starts, and mdadm then gets stuck waiting for the reshape to start. To
avoid this the sync speed is limited, and it is restored while waiting
for the reshape to finish. That is enough for the cases without a backup
file.
When a backup file is specified, the systemd service mdadm-grow-continue
monitors the reshape progress. If the reshape finishes before the service
starts monitoring, the daemon gets stuck as well, because reshape_progress
is still 0, which means the reshape has not started. So give the service
more time to read the right information from kernel space.
Before reading that information the service needs to suspend the array
while the reshape is running, and the kernel reshape daemon updates the
metadata every 10s. So the sync speed has to stay limited for more than
10s before it is restored; then the systemd service can suspend the array
and start monitoring the reshape progress.
4. Wait until the mdadm-grow-continue service exits.
'mdadm --wait' does not wait for the systemd service. For the cases that
need a backup file, the service deletes the backup file after the reshape
finishes. In this test the next step runs as soon as the reshape
finishes, and it fails because the backup file cannot be created: the old
backup file still exists.
5. Don't reshape from raid5 to raid1. It cannot work now.
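A minimal sketch of the throttling pattern described in change 3, assuming
/dev/md0 is the array; the numbers are only illustrative, and the real
handling is split between this test and tests/func.sh:
  min=$(cat /proc/sys/dev/raid/speed_limit_min)
  max=$(cat /proc/sys/dev/raid/speed_limit_max)
  echo 100 > /proc/sys/dev/raid/speed_limit_min    # throttle the reshape
  echo 500 > /proc/sys/dev/raid/speed_limit_max
  mdadm --grow /dev/md0 -n4 --backup-file /tmp/md-test-backup
  sleep 15                 # longer than the ~10s metadata update interval
  echo $min > /proc/sys/dev/raid/speed_limit_min   # restore the old limits
  echo $max > /proc/sys/dev/raid/speed_limit_max
  mdadm --wait /dev/md0
  # mdadm --wait does not cover the service; wait for it to exit as well
  while ps auxf | grep "mdadm --grow --continue" | grep -v grep
  do
          sleep 1
  done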
Signed-off-by: Xiao Ni <xni@redhat.com>
---
tests/07changelevels | 27 ++++++++++++---------------
tests/07changelevels.broken | 9 ---------
tests/func.sh | 4 ++++
3 files changed, 16 insertions(+), 24 deletions(-)
delete mode 100644 tests/07changelevels.broken
diff --git a/tests/07changelevels b/tests/07changelevels
index a328874ac43f..3df8660e6bae 100644
--- a/tests/07changelevels
+++ b/tests/07changelevels
@@ -10,7 +10,6 @@ export MDADM_GROW_VERIFY=1
dotest() {
sleep 2
check wait
- testdev $md0 $1 19968 64 nd
blockdev --flushbufs $md0
cmp -s -n $[textK*1024] $md0 /tmp/RandFile || { echo cmp failed; exit 2; }
# write something new - shift chars 4 space
@@ -24,7 +23,7 @@ checkgeo() {
# level raid_disks chunk_size layout
dev=$1
shift
- sleep 0.5
+ sleep 15
check wait
sleep 1
for attr in level raid_disks chunk_size layout
@@ -43,22 +42,25 @@ checkgeo() {
bu=/tmp/md-test-backup
rm -f $bu
-mdadm -CR $md0 -l1 -n2 -x1 $dev0 $dev1 $dev2 -z 19968
-testdev $md0 1 $mdsize1a 64
+mdadm -CR $md0 -l1 -n2 -x1 $dev0 $dev1 $dev2
+[ -b $md0 ] || die "$1 isn't a block device."
dd if=/tmp/RandFile of=$md0
dotest 1
-mdadm --grow $md0 -l5 -n3 --chunk 64
+mdadm --grow $md0 -l5 -n3
+checkgeo md0 raid5 3
dotest 2
mdadm $md0 --add $dev3 $dev4
mdadm --grow $md0 -n4 --chunk 32
+checkgeo md0 raid5 4 $[32*1024]
dotest 3
mdadm -G $md0 -l6 --backup-file $bu
+checkgeo md0 raid6 5 $[32*1024]
dotest 3
-mdadm -G /dev/md0 --array-size 39936
+mdadm -G /dev/md0 --array-size 37888
mdadm -G $md0 -n4 --backup-file $bu
checkgeo md0 raid6 4 $[32*1024]
dotest 2
@@ -67,14 +69,11 @@ mdadm -G $md0 -l5 --backup-file $bu
checkgeo md0 raid5 3 $[32*1024]
dotest 2
-mdadm -G /dev/md0 --array-size 19968
+mdadm -G /dev/md0 --array-size 18944
mdadm -G $md0 -n2 --backup-file $bu
checkgeo md0 raid5 2 $[32*1024]
dotest 1
-mdadm -G --level=1 $md0
-dotest 1
-
# now repeat that last few steps only with a degraded array.
mdadm -S $md0
mdadm -CR $md0 -l6 -n5 $dev0 $dev1 $dev2 $dev3 $dev4
@@ -83,7 +82,7 @@ dotest 3
mdadm $md0 --fail $dev0
-mdadm -G /dev/md0 --array-size 37888
+mdadm -G /dev/md0 --array-size 35840
mdadm -G $md0 -n4 --backup-file $bu
dotest 2
checkgeo md0 raid6 4 $[512*1024]
@@ -103,12 +102,10 @@ dotest 2
mdadm -G $md0 -l5 --backup-file $bu
dotest 2
-mdadm -G /dev/md0 --array-size 18944
+mdadm -G /dev/md0 --array-size 17920
mdadm -G $md0 -n2 --backup-file $bu
dotest 1
checkgeo md0 raid5 2 $[512*1024]
mdadm $md0 --fail $dev2
-mdadm -G --level=1 $md0
-dotest 1
-checkgeo md0 raid1 2
+mdadm -S $md0
diff --git a/tests/07changelevels.broken b/tests/07changelevels.broken
index 9b930d932c48..000000000000
--- a/tests/07changelevels.broken
+++ /dev/null
@@ -1,9 +0,0 @@
-always fails
-
-Fails with errors:
-
- mdadm: /dev/loop0 is smaller than given size. 18976K < 19968K + metadata
- mdadm: /dev/loop1 is smaller than given size. 18976K < 19968K + metadata
- mdadm: /dev/loop2 is smaller than given size. 18976K < 19968K + metadata
-
- ERROR: /dev/md0 isn't a block device.
diff --git a/tests/func.sh b/tests/func.sh
index e7ccc4fc66eb..567d91d9173e 100644
--- a/tests/func.sh
+++ b/tests/func.sh
@@ -362,6 +362,10 @@ check() {
do
sleep 0.5
done
+ while ps auxf | grep "mdadm --grow --continue" | grep -v grep
+ do
+ sleep 1
+ done
echo $min > /proc/sys/dev/raid/speed_limit_min
echo $max > /proc/sys/dev/raid/speed_limit_max
;;
--
2.32.0 (Apple Git-132)
* [PATCH 07/10] mdadm/tests: Remove 07reshape5intr.broken
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
` (5 preceding siblings ...)
2024-09-11 8:54 ` [PATCH 06/10] mdadm/tests: 07changelevels fix Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-11 8:54 ` [PATCH 08/10] mdadm/tests: 07testreshape5 fix Xiao Ni
` (3 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
07reshape5intr can run successfully now.
Signed-off-by: Xiao Ni <xni@redhat.com>
---
tests/07reshape5intr.broken | 45 -------------------------------------
1 file changed, 45 deletions(-)
delete mode 100644 tests/07reshape5intr.broken
diff --git a/tests/07reshape5intr.broken b/tests/07reshape5intr.broken
index efe52a667172..000000000000
--- a/tests/07reshape5intr.broken
+++ /dev/null
@@ -1,45 +0,0 @@
-always fails
-
-This patch, recently added to md-next causes the test to always fail:
-
-7e6ba434cc60 ("md: don't unregister sync_thread with reconfig_mutex
-held")
-
-The new error is simply:
-
- ERROR: no reshape happening
-
-Before the patch, the error seen is below.
-
---
-
-fails infrequently
-
-Fails roughly 1 in 4 runs with errors:
-
- mdadm: Merging with already-assembled /dev/md/0
- mdadm: cannot re-read metadata from /dev/loop6 - aborting
-
- ERROR: no reshape happening
-
-Also have seen a random deadlock:
-
- INFO: task mdadm:109702 blocked for more than 30 seconds.
- Not tainted 5.18.0-rc3-eid-vmlocalyes-dbg-00095-g3c2b5427979d #2040
- "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
- task:mdadm state:D stack: 0 pid:109702 ppid: 1 flags:0x00004000
- Call Trace:
- <TASK>
- __schedule+0x67e/0x13b0
- schedule+0x82/0x110
- mddev_suspend+0x2e1/0x330
- suspend_lo_store+0xbd/0x140
- md_attr_store+0xcb/0x130
- sysfs_kf_write+0x89/0xb0
- kernfs_fop_write_iter+0x202/0x2c0
- new_sync_write+0x222/0x330
- vfs_write+0x3bc/0x4d0
- ksys_write+0xd9/0x180
- __x64_sys_write+0x43/0x50
- do_syscall_64+0x3b/0x90
- entry_SYSCALL_64_after_hwframe+0x44/0xae
--
2.32.0 (Apple Git-132)
* [PATCH 08/10] mdadm/tests: 07testreshape5 fix
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
` (6 preceding siblings ...)
2024-09-11 8:54 ` [PATCH 07/10] mdadm/tests: Remove 07reshape5intr.broken Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-11 8:54 ` [PATCH 09/10] mdadm/tests: remove 09imsm-assemble.broken Xiao Ni
` (2 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
Initialize $dir (it was never set, so the test could not find test_stripe)
to avoid the test failure.
Signed-off-by: Xiao Ni <xni@redhat.com>
---
tests/07testreshape5 | 1 +
tests/07testreshape5.broken | 12 ------------
2 files changed, 1 insertion(+), 12 deletions(-)
delete mode 100644 tests/07testreshape5.broken
diff --git a/tests/07testreshape5 b/tests/07testreshape5
index 0e1f25f98bc8..d90fd15e0e61 100644
--- a/tests/07testreshape5
+++ b/tests/07testreshape5
@@ -4,6 +4,7 @@
# kernel md code to move data into and out of variously
# shaped md arrays.
set -x
+dir="."
layouts=(la ra ls rs)
for level in 5 6
do
diff --git a/tests/07testreshape5.broken b/tests/07testreshape5.broken
index a8ce03e491b3..000000000000
--- a/tests/07testreshape5.broken
+++ /dev/null
@@ -1,12 +0,0 @@
-always fails
-
-Test seems to run 'test_stripe' at $dir directory, but $dir is never
-set. If $dir is adjusted to $PWD, the test still fails with:
-
- mdadm: /dev/loop2 is not suitable for this array.
- mdadm: create aborted
- ++ return 1
- ++ cmp -s -n 8192 /dev/md0 /tmp/RandFile
- ++ echo cmp failed
- cmp failed
- ++ exit 2
--
2.32.0 (Apple Git-132)
* [PATCH 09/10] mdadm/tests: remove 09imsm-assemble.broken
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
` (7 preceding siblings ...)
2024-09-11 8:54 ` [PATCH 08/10] mdadm/tests: 07testreshape5 fix Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-23 9:08 ` Mariusz Tkaczyk
2024-09-11 8:54 ` [PATCH 10/10] mdadm/Manage: record errno Xiao Ni
2024-09-23 8:51 ` [PATCH 00/10] mdadm tests fix Mariusz Tkaczyk
10 siblings, 1 reply; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
09imsm-assemble can run successfully.
Signed-off-by: Xiao Ni <xni@redhat.com>
---
tests/09imsm-assemble.broken | 6 ------
1 file changed, 6 deletions(-)
delete mode 100644 tests/09imsm-assemble.broken
diff --git a/tests/09imsm-assemble.broken b/tests/09imsm-assemble.broken
index a6d4d5cf911b..000000000000
--- a/tests/09imsm-assemble.broken
+++ /dev/null
@@ -1,6 +0,0 @@
-fails infrequently
-
-Fails roughly 1 in 10 runs with errors:
-
- mdadm: /dev/loop2 is still in use, cannot remove.
- /dev/loop2 removal from /dev/md/container should have succeeded
--
2.32.0 (Apple Git-132)
* [PATCH 10/10] mdadm/Manage: record errno
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
` (8 preceding siblings ...)
2024-09-11 8:54 ` [PATCH 09/10] mdadm/tests: remove 09imsm-assemble.broken Xiao Ni
@ 2024-09-11 8:54 ` Xiao Ni
2024-09-23 8:51 ` [PATCH 00/10] mdadm tests fix Mariusz Tkaczyk
10 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-11 8:54 UTC (permalink / raw)
To: mariusz.tkaczyk; +Cc: linux-raid, ncroxon
Sometimes it reports:
mdadm: failed to stop array /dev/md0: Success
This happens because errno has been reset by the time the message is
printed, so record errno inside the retry loop.
Signed-off-by: Xiao Ni <xni@redhat.com>
---
Manage.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/Manage.c b/Manage.c
index 241de05520d6..aba97df8e122 100644
--- a/Manage.c
+++ b/Manage.c
@@ -238,13 +238,14 @@ int Manage_stop(char *devname, int fd, int verbose, int will_retry)
"array_state",
"inactive")) < 0 &&
errno == EBUSY) {
+ err = errno;
sleep_for(0, MSEC_TO_NSEC(200), true);
count--;
}
if (err) {
if (verbose >= 0)
pr_err("failed to stop array %s: %s\n",
- devname, strerror(errno));
+ devname, strerror(err));
rv = 1;
goto out;
}
@@ -438,14 +439,15 @@ done:
count = 25; err = 0;
while (count && fd >= 0 &&
(err = ioctl(fd, STOP_ARRAY, NULL)) < 0 && errno == EBUSY) {
+ err = errno;
sleep_for(0, MSEC_TO_NSEC(200), true);
count --;
}
if (fd >= 0 && err) {
if (verbose >= 0) {
pr_err("failed to stop array %s: %s\n",
- devname, strerror(errno));
- if (errno == EBUSY)
+ devname, strerror(err));
+ if (err == EBUSY)
cont_err("Perhaps a running process, mounted filesystem or active volume group?\n");
}
rv = 1;
--
2.32.0 (Apple Git-132)
* Re: [PATCH 00/10] mdadm tests fix
2024-09-11 8:54 [PATCH 00/10] mdadm tests fix Xiao Ni
` (9 preceding siblings ...)
2024-09-11 8:54 ` [PATCH 10/10] mdadm/Manage: record errno Xiao Ni
@ 2024-09-23 8:51 ` Mariusz Tkaczyk
10 siblings, 0 replies; 16+ messages in thread
From: Mariusz Tkaczyk @ 2024-09-23 8:51 UTC (permalink / raw)
To: Xiao Ni; +Cc: linux-raid, ncroxon
On Wed, 11 Sep 2024 16:54:22 +0800
Xiao Ni <xni@redhat.com> wrote:
> This is the fourth patch set which enhance/fix mdadm regression tests.
>
> v2: fix problems for the first patches
>
> Xiao Ni (10):
> mdadm/Grow: Update new level when starting reshape
> mdadm/Grow: Update reshape_progress to need_back after reshape
> finishes
> mdadm/Grow: Can't open raid when running --grow --continue
> mdadm/Grow: sleep a while after removing disk in impose_level
> mdadm/tests: wait until level changes
> mdadm/tests: 07changelevels fix
> mdadm/tests: Remove 07reshape5intr.broken
> mdadm/tests: 07testreshape5 fix
> mdadm/tests: remove 09imsm-assemble.broken
> mdadm/Manage: record errno
>
> Grow.c | 39 +++++++++++++++++++++++++------
> Manage.c | 8 ++++---
> dev/null | 0
> tests/05r6tor0 | 4 ++++
> tests/07changelevels | 27 ++++++++++------------
> tests/07changelevels.broken | 9 --------
> tests/07reshape5intr.broken | 45 ------------------------------------
> tests/07testreshape5 | 1 +
> tests/07testreshape5.broken | 12 ----------
> tests/09imsm-assemble.broken | 6 -----
> tests/func.sh | 4 ++++
> 11 files changed, 58 insertions(+), 97 deletions(-)
> create mode 100644 dev/null
> delete mode 100644 tests/07changelevels.broken
> delete mode 100644 tests/07reshape5intr.broken
> delete mode 100644 tests/07testreshape5.broken
> delete mode 100644 tests/09imsm-assemble.broken
>
Applied all! We are working on enabling the mdadm tests on GitHub, so we
will see if it helps. I'm also aware of the dependency on the new
md/new_level property.
Sorry for the delay, I was on holiday.
Thanks,
Mariusz
* Re: [PATCH 09/10] mdadm/tests: remove 09imsm-assemble.broken
2024-09-11 8:54 ` [PATCH 09/10] mdadm/tests: remove 09imsm-assemble.broken Xiao Ni
@ 2024-09-23 9:08 ` Mariusz Tkaczyk
0 siblings, 0 replies; 16+ messages in thread
From: Mariusz Tkaczyk @ 2024-09-23 9:08 UTC (permalink / raw)
To: Xiao Ni; +Cc: linux-raid, ncroxon
On Wed, 11 Sep 2024 16:54:31 +0800
Xiao Ni <xni@redhat.com> wrote:
> 09imsm-assemble can run successfully.
>
> Signed-off-by: Xiao Ni <xni@redhat.com>
> ---
> tests/09imsm-assemble.broken | 6 ------
> 1 file changed, 6 deletions(-)
> delete mode 100644 tests/09imsm-assemble.broken
>
> diff --git a/tests/09imsm-assemble.broken b/tests/09imsm-assemble.broken
> index a6d4d5cf911b..000000000000
> --- a/tests/09imsm-assemble.broken
> +++ /dev/null
> @@ -1,6 +0,0 @@
> -fails infrequently
> -
> -Fails roughly 1 in 10 runs with errors:
> -
> - mdadm: /dev/loop2 is still in use, cannot remove.
> - /dev/loop2 removal from /dev/md/container should have succeeded
You created dev/null (probably accidentally). I will remove it in a new
commit because I have already merged this one.
Thanks,
Mariusz
* Re: [PATCH V2 1/1] mdadm/Grow: Update new level when starting reshape
2024-09-11 8:54 ` [PATCH V2 1/1] mdadm/Grow: Update new level when starting reshape Xiao Ni
@ 2024-09-25 7:51 ` Mariusz Tkaczyk
2024-09-25 12:57 ` Xiao Ni
0 siblings, 1 reply; 16+ messages in thread
From: Mariusz Tkaczyk @ 2024-09-25 7:51 UTC (permalink / raw)
To: Xiao Ni; +Cc: linux-raid, ncroxon
On Wed, 11 Sep 2024 16:54:23 +0800
Xiao Ni <xni@redhat.com> wrote:
> +
> + /* new_level is introduced in kernel 6.12 */
> + if (!err && get_linux_version() >= 6012000 &&
> + sysfs_set_num(sra, NULL, "new_level",
> info->new_level) < 0)
> + err = errno;
Hi Xiao,
I realized that we could do this better by checking for the existence of
the new_level sysfs file. As written, the solution is limited to kernels
>= 6.12, so for example Red Hat 9 with its 5.14 kernel will never pass
the condition even if the attribute has been backported. I know that you
fixed a test issue, but someone may still hit this in real life.
I'm not going to rework it myself; I'm fine with the current approach
until someone reports an issue on an older kernel.
If you do rework it, please leave a comment about the kernel version the
attribute was added in, so future maintainers know when the additional
verification can be removed.
Thanks,
Mariusz
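A minimal sketch of the probe suggested above, written here as a shell
check on the sysfs path (in mdadm itself this would be an equivalent
access()/stat() call on the attribute before writing it; the device name
md0 is only an example):
  md=md0
  if [ -e /sys/block/$md/md/new_level ]
  then
          echo "new_level attribute present: write the target level"
  else
          echo "no new_level attribute (pre-6.12 kernel): skip the write"
  fi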
* Re: [PATCH V2 1/1] mdadm/Grow: Update new level when starting reshape
2024-09-25 7:51 ` Mariusz Tkaczyk
@ 2024-09-25 12:57 ` Xiao Ni
0 siblings, 0 replies; 16+ messages in thread
From: Xiao Ni @ 2024-09-25 12:57 UTC (permalink / raw)
To: Mariusz Tkaczyk; +Cc: linux-raid, ncroxon
On Wed, Sep 25, 2024 at 3:51 PM Mariusz Tkaczyk
<mariusz.tkaczyk@linux.intel.com> wrote:
>
> On Wed, 11 Sep 2024 16:54:23 +0800
> Xiao Ni <xni@redhat.com> wrote:
>
> > +
> > + /* new_level is introduced in kernel 6.12 */
> > + if (!err && get_linux_version() >= 6012000 &&
> > + sysfs_set_num(sra, NULL, "new_level",
> > info->new_level) < 0)
> > + err = errno;
>
> Hi Xiao,
> I realized that we would do this better by checking existence of new_level
> sysfs file. This way, our solution is limited to kernel > 6.12 so, for example
> redhat 9 with kernel 5.14 will never pass the condition. I know that you fixed
> test issue but someone still may find this in real life.
>
> I'm not going to rework it myself, I'm fine with current approach until
> someone will report issue about that for older kernel.
>
> If you are going to rework this, please left a comment about kernel version
> that it was added, to let future maintainers know when the additional
> verification can be removed.
Hi Mariusz
Thanks for pointing this out. You're right. I'll fix it.
Regards
Xiao
>
> Thanks,
> Mariusz
>