* [PATCH mdadm v2 01/14] Makefile: Don't build static build with everything and everything-test
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-28 7:00 ` Mariusz Tkaczyk
2022-06-22 20:25 ` [PATCH mdadm v2 02/14] DDF: Cleanup validate_geometry_ddf_container() Logan Gunthorpe
` (14 subsequent siblings)
15 siblings, 1 reply; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
Running the test suite requires building everything, but the static
version of mdadm is now difficult to build because there is no readily
available static udev library.
The test suite doesn't need the static binary so just don't build it
with the everything or everything-test targets.
Leave the mdadm.static and install-static targets in place in case
someone still has a use case for the static binary.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Makefile b/Makefile
index bf126033b841..ec1f99ed5d83 100644
--- a/Makefile
+++ b/Makefile
@@ -182,9 +182,9 @@ check_rundir:
echo "***** or set CHECK_RUN_DIR=0"; exit 1; \
fi
-everything: all mdadm.static swap_super test_stripe raid6check \
+everything: all swap_super test_stripe raid6check \
mdadm.Os mdadm.O2 man
-everything-test: all mdadm.static swap_super test_stripe \
+everything-test: all swap_super test_stripe \
mdadm.Os mdadm.O2 man
# mdadm.uclibc doesn't work on x86-64
# mdadm.tcc doesn't work..
--
2.30.2
* Re: [PATCH mdadm v2 01/14] Makefile: Don't build static build with everything and everything-test
2022-06-22 20:25 ` [PATCH mdadm v2 01/14] Makefile: Don't build static build with everything and everything-test Logan Gunthorpe
@ 2022-06-28 7:00 ` Mariusz Tkaczyk
0 siblings, 0 replies; 23+ messages in thread
From: Mariusz Tkaczyk @ 2022-06-28 7:00 UTC (permalink / raw)
To: Logan Gunthorpe
Cc: linux-raid, Jes Sorensen, Song Liu, Christoph Hellwig,
Donald Buczek, Guoqing Jiang, Xiao Ni, Himanshu Madhani, Coly Li,
Bruce Dubbs, Stephen Bates, Martin Oliveira, David Sloan
On Wed, 22 Jun 2022 14:25:06 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:
> Running the test suite requires building everything, but it seems to be
> difficult to build the static version of mdadm now seeing there
> is no readily available static udev library.
>
> The test suite doesn't need the static binary so just don't build it
> with the everything or everything-test targets.
>
> Leave the mdadm.static and install-static targets in place in case
> someone still has a use case for the static binary.
>
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> ---
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* [PATCH mdadm v2 02/14] DDF: Cleanup validate_geometry_ddf_container()
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 01/14] Makefile: Don't build static build with everything and everything-test Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 03/14] DDF: Fix NULL pointer dereference in validate_geometry_ddf() Logan Gunthorpe
` (13 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
Move the function up so that the function declaration is not necessary
and remove the unused arguments to the function.
No functional changes are intended, but this will help with a bug fix in
the next patch.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
---
super-ddf.c | 88 ++++++++++++++++++++++++-----------------------------
1 file changed, 39 insertions(+), 49 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index abbc8b09c617..9d867f6910f3 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -503,13 +503,6 @@ struct ddf_super {
static int load_super_ddf_all(struct supertype *st, int fd,
void **sbp, char *devname);
static int get_svd_state(const struct ddf_super *, const struct vcl *);
-static int
-validate_geometry_ddf_container(struct supertype *st,
- int level, int layout, int raiddisks,
- int chunk, unsigned long long size,
- unsigned long long data_offset,
- char *dev, unsigned long long *freesize,
- int verbose);
static int validate_geometry_ddf_bvd(struct supertype *st,
int level, int layout, int raiddisks,
@@ -3322,6 +3315,42 @@ static int reserve_space(struct supertype *st, int raiddisks,
return 1;
}
+static int
+validate_geometry_ddf_container(struct supertype *st,
+ int level, int raiddisks,
+ unsigned long long data_offset,
+ char *dev, unsigned long long *freesize,
+ int verbose)
+{
+ int fd;
+ unsigned long long ldsize;
+
+ if (level != LEVEL_CONTAINER)
+ return 0;
+ if (!dev)
+ return 1;
+
+ fd = dev_open(dev, O_RDONLY|O_EXCL);
+ if (fd < 0) {
+ if (verbose)
+ pr_err("ddf: Cannot open %s: %s\n",
+ dev, strerror(errno));
+ return 0;
+ }
+ if (!get_dev_size(fd, dev, &ldsize)) {
+ close(fd);
+ return 0;
+ }
+ close(fd);
+ if (freesize) {
+ *freesize = avail_size_ddf(st, ldsize >> 9, INVALID_SECTORS);
+ if (*freesize == 0)
+ return 0;
+ }
+
+ return 1;
+}
+
static int validate_geometry_ddf(struct supertype *st,
int level, int layout, int raiddisks,
int *chunk, unsigned long long size,
@@ -3347,11 +3376,9 @@ static int validate_geometry_ddf(struct supertype *st,
level = LEVEL_CONTAINER;
if (level == LEVEL_CONTAINER) {
/* Must be a fresh device to add to a container */
- return validate_geometry_ddf_container(st, level, layout,
- raiddisks, *chunk,
- size, data_offset, dev,
- freesize,
- verbose);
+ return validate_geometry_ddf_container(st, level, raiddisks,
+ data_offset, dev,
+ freesize, verbose);
}
if (!dev) {
@@ -3449,43 +3476,6 @@ static int validate_geometry_ddf(struct supertype *st,
return 1;
}
-static int
-validate_geometry_ddf_container(struct supertype *st,
- int level, int layout, int raiddisks,
- int chunk, unsigned long long size,
- unsigned long long data_offset,
- char *dev, unsigned long long *freesize,
- int verbose)
-{
- int fd;
- unsigned long long ldsize;
-
- if (level != LEVEL_CONTAINER)
- return 0;
- if (!dev)
- return 1;
-
- fd = dev_open(dev, O_RDONLY|O_EXCL);
- if (fd < 0) {
- if (verbose)
- pr_err("ddf: Cannot open %s: %s\n",
- dev, strerror(errno));
- return 0;
- }
- if (!get_dev_size(fd, dev, &ldsize)) {
- close(fd);
- return 0;
- }
- close(fd);
- if (freesize) {
- *freesize = avail_size_ddf(st, ldsize >> 9, INVALID_SECTORS);
- if (*freesize == 0)
- return 0;
- }
-
- return 1;
-}
-
static int validate_geometry_ddf_bvd(struct supertype *st,
int level, int layout, int raiddisks,
int *chunk, unsigned long long size,
--
2.30.2
* [PATCH mdadm v2 03/14] DDF: Fix NULL pointer dereference in validate_geometry_ddf()
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 01/14] Makefile: Don't build static build with everything and everything-test Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 02/14] DDF: Cleanup validate_geometry_ddf_container() Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 04/14] mdadm/Grow: Fix use after close bug by closing after fork Logan Gunthorpe
` (12 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
A relatively recent patch added a call to validate_geometry() in
Manage_add() that has level=LEVEL_CONTAINER and chunk=NULL.
This causes some ddf tests to segfault which aborts the test suite.
To fix this, avoid dereferencing chunk when the level is
LEVEL_CONTAINER or LEVEL_NONE.
Fixes: 1f5d54a06df0 ("Manage: Call validate_geometry when adding drive to external container")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
---
super-ddf.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/super-ddf.c b/super-ddf.c
index 9d867f6910f3..949e7d155474 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -3369,9 +3369,6 @@ static int validate_geometry_ddf(struct supertype *st,
* If given BVDs, we make an SVD, changing all the GUIDs in the process.
*/
- if (*chunk == UnSet)
- *chunk = DEFAULT_CHUNK;
-
if (level == LEVEL_NONE)
level = LEVEL_CONTAINER;
if (level == LEVEL_CONTAINER) {
@@ -3381,6 +3378,9 @@ static int validate_geometry_ddf(struct supertype *st,
freesize, verbose);
}
+ if (*chunk == UnSet)
+ *chunk = DEFAULT_CHUNK;
+
if (!dev) {
mdu_array_info_t array = {
.level = level,
--
2.30.2
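
The ordering problem described in this patch can be illustrated with a minimal sketch. The function name and the SK_* constants below are placeholders invented for the example and do not match mdadm's definitions; only the idea of returning on the container path before chunk is dereferenced comes from the patch.

```c
#include <stddef.h>

/* Placeholder constants for illustration only; not mdadm's real values. */
enum {
	SK_LEVEL_NONE = -1,
	SK_LEVEL_CONTAINER = -2,
	SK_UNSET = -3,
	SK_DEFAULT_CHUNK = 512,
};

/*
 * Returns 1 if the geometry is acceptable.  Callers may pass chunk == NULL
 * when level is SK_LEVEL_CONTAINER or SK_LEVEL_NONE.
 */
static int sketch_validate_geometry(int level, int *chunk)
{
	if (level == SK_LEVEL_NONE)
		level = SK_LEVEL_CONTAINER;
	if (level == SK_LEVEL_CONTAINER)
		return 1;	/* container path never looks at chunk */

	/* Only array levels reach this point, where chunk must be valid. */
	if (*chunk == SK_UNSET)
		*chunk = SK_DEFAULT_CHUNK;
	return 1;
}
```

With the default applied after the container check, a call such as sketch_validate_geometry(SK_LEVEL_CONTAINER, NULL) returns without touching the pointer.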
* [PATCH mdadm v2 04/14] mdadm/Grow: Fix use after close bug by closing after fork
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (2 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 03/14] DDF: Fix NULL pointer dereference in validate_geometry_ddf() Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-28 7:02 ` Mariusz Tkaczyk
2022-06-22 20:25 ` [PATCH mdadm v2 05/14] monitor: Avoid segfault when calling NULL get_bad_blocks Logan Gunthorpe
` (11 subsequent siblings)
15 siblings, 1 reply; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
Alex Wu, BingJing Chang, Danny Shih, ChangSyun Peng
The test 07reshape-grow fails most of the time. But it succeeds around
1 in 5 times. When it does succeed, it causes the tests to die because
mdadm has segfaulted.
The segfault was caused by mdadm attempting to reopen a file
descriptor that was already closed. The backtrace of the segfault
was:
#0 __strncmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:101
#1 0x000056146e31d44b in devnm2devid (devnm=0x0) at util.c:956
#2 0x000056146e31dab4 in open_dev_flags (devnm=0x0, flags=0)
at util.c:1072
#3 0x000056146e31db22 in open_dev (devnm=0x0) at util.c:1079
#4 0x000056146e3202e8 in reopen_mddev (mdfd=4) at util.c:2244
#5 0x000056146e329f36 in start_array (mdfd=4,
mddev=0x7ffc55342450 "/dev/md0", content=0x7ffc55342860,
st=0x56146fc78660, ident=0x7ffc55342f70, best=0x56146fc6f5d0,
bestcnt=10, chosen_drive=0, devices=0x56146fc706b0, okcnt=5,
sparecnt=0, rebuilding_cnt=0, journalcnt=0, c=0x7ffc55342e90,
clean=1, avail=0x56146fc78720 "\001\001\001\001\001",
start_partial_ok=0, err_ok=0, was_forced=0)
at Assemble.c:1206
#6 0x000056146e32c36e in Assemble (st=0x56146fc78660,
mddev=0x7ffc55342450 "/dev/md0", ident=0x7ffc55342f70,
devlist=0x56146fc6e2d0, c=0x7ffc55342e90)
at Assemble.c:1914
#7 0x000056146e312ac9 in main (argc=11, argv=0x7ffc55343238)
at mdadm.c:1510
The file descriptor was closed early in Grow_continue(). The noted commit
moved the close() call to close the fd above the fork which caused the
parent process to return with a closed fd.
This meant reshape_array() and Grow_continue() would return in the parent
with the fd closed. The fd would eventually be passed to reopen_mddev(),
where fd2devnm() returned NULL; the unhandled NULL would then be
dereferenced in devnm2devid().
Fix this by moving the close() call below the fork. This appears to
fix the 07revert-grow test. While we're at it, switch to using
close_fd() to invalidate the file descriptor.
Fixes: 77b72fa82813 ("mdadm/Grow: prevent md's fd from being occupied during delayed time")
Cc: Alex Wu <alexwu@synology.com>
Cc: BingJing Chang <bingjingc@synology.com>
Cc: Danny Shih <dannyshih@synology.com>
Cc: ChangSyun Peng <allenpeng@synology.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
Grow.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/Grow.c b/Grow.c
index f6efbc48dafd..0e2d7181bcab 100644
--- a/Grow.c
+++ b/Grow.c
@@ -3514,7 +3514,6 @@ started:
return 0;
}
- close(fd);
/* Now we just need to kick off the reshape and watch, while
* handling backups of the data...
* This is all done by a forked background process.
@@ -3535,6 +3534,9 @@ started:
break;
}
+ /* Close unused file descriptor in the forked process */
+ close_fd(&fd);
+
/* If another array on the same devices is busy, the
* reshape will wait for them. This would mean that
* the first section that we suspend will stay suspended
--
2.30.2
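
The effect of closing after the fork rather than before it can be shown with a small standalone sketch. This is not mdadm code: sketch_close_fd() is a hypothetical stand-in for close_fd(), and a pipe is used in place of the md device so the example runs anywhere POSIX fork() is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for close_fd(): close, then mark the fd invalid. */
static void sketch_close_fd(int *fd)
{
	if (*fd >= 0 && close(*fd) == 0)
		*fd = -1;
}

/*
 * Fork, close the descriptors only in the child, and report whether the
 * parent's copy is still usable: 1 if it is, 0 if not, -1 on setup error.
 */
static int parent_fd_survives_fork(void)
{
	int fds[2];
	int status, usable;
	pid_t pid, waited;

	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: release its own copies after the fork. */
		sketch_close_fd(&fds[0]);
		sketch_close_fd(&fds[1]);
		_exit(fds[0] == -1 && fds[1] == -1 ? 0 : 1);
	}

	/* Parent: its descriptors are unaffected by the child's close(). */
	waited = waitpid(pid, &status, 0);
	usable = fcntl(fds[0], F_GETFD) != -1;
	close(fds[0]);
	close(fds[1]);
	if (waited != pid || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return usable;
}
```

Had the close happened before fork(), both processes would have inherited an already closed descriptor, which is the situation the backtrace in this patch shows. Setting the variable to -1 after closing also makes a later accidental reuse easy to detect.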
* Re: [PATCH mdadm v2 04/14] mdadm/Grow: Fix use after close bug by closing after fork
2022-06-22 20:25 ` [PATCH mdadm v2 04/14] mdadm/Grow: Fix use after close bug by closing after fork Logan Gunthorpe
@ 2022-06-28 7:02 ` Mariusz Tkaczyk
0 siblings, 0 replies; 23+ messages in thread
From: Mariusz Tkaczyk @ 2022-06-28 7:02 UTC (permalink / raw)
To: Logan Gunthorpe
Cc: linux-raid, Jes Sorensen, Song Liu, Christoph Hellwig,
Donald Buczek, Guoqing Jiang, Xiao Ni, Himanshu Madhani, Coly Li,
Bruce Dubbs, Stephen Bates, Martin Oliveira, David Sloan, Alex Wu,
BingJing Chang, Danny Shih, ChangSyun Peng
On Wed, 22 Jun 2022 14:25:09 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:
> The test 07reshape-grow fails most of the time. But it succeeds around
> 1 in 5 times. When it does succeed, it causes the tests to die because
> mdadm has segfaulted.
>
> The segfault was caused by mdadm attempting to reopen a file
> descriptor that was already closed. The backtrace of the segfault
> was:
>
> #0 __strncmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:101
> #1 0x000056146e31d44b in devnm2devid (devnm=0x0) at util.c:956
> #2 0x000056146e31dab4 in open_dev_flags (devnm=0x0, flags=0)
> at util.c:1072
> #3 0x000056146e31db22 in open_dev (devnm=0x0) at util.c:1079
> #4 0x000056146e3202e8 in reopen_mddev (mdfd=4) at util.c:2244
> #5 0x000056146e329f36 in start_array (mdfd=4,
> mddev=0x7ffc55342450 "/dev/md0", content=0x7ffc55342860,
> st=0x56146fc78660, ident=0x7ffc55342f70, best=0x56146fc6f5d0,
> bestcnt=10, chosen_drive=0, devices=0x56146fc706b0, okcnt=5,
> sparecnt=0, rebuilding_cnt=0, journalcnt=0, c=0x7ffc55342e90,
> clean=1, avail=0x56146fc78720 "\001\001\001\001\001",
> start_partial_ok=0, err_ok=0, was_forced=0)
> at Assemble.c:1206
> #6 0x000056146e32c36e in Assemble (st=0x56146fc78660,
> mddev=0x7ffc55342450 "/dev/md0", ident=0x7ffc55342f70,
> devlist=0x56146fc6e2d0, c=0x7ffc55342e90)
> at Assemble.c:1914
> #7 0x000056146e312ac9 in main (argc=11, argv=0x7ffc55343238)
> at mdadm.c:1510
>
> The file descriptor was closed early in Grow_continue(). The noted commit
> moved the close() call to close the fd above the fork which caused the
> parent process to return with a closed fd.
>
> This meant reshape_array() and Grow_continue() would return in the parent
> with the fd forked. The fd would eventually be passed to reopen_mddev()
> which returned an unhandled NULL from fd2devnm() which would then be
> dereferenced in devnm2devid.
>
> Fix this by moving the close() call below the fork. This appears to
> fix the 07revert-grow test. While we're at it, switch to using
> close_fd() to invalidate the file descriptor.
>
> Fixes: 77b72fa82813 ("mdadm/Grow: prevent md's fd from being occupied during delayed time")
> Cc: Alex Wu <alexwu@synology.com>
> Cc: BingJing Chang <bingjingc@synology.com>
> Cc: Danny Shih <dannyshih@synology.com>
> Cc: ChangSyun Peng <allenpeng@synology.com>
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> ---
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* [PATCH mdadm v2 05/14] monitor: Avoid segfault when calling NULL get_bad_blocks
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (3 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 04/14] mdadm/Grow: Fix use after close bug by closing after fork Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 06/14] mdadm: Fix mdadm -r remove option regression Logan Gunthorpe
` (10 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
Not every struct superswitch implements a get_bad_blocks() function,
yet mdmon seems to call it without checking for NULL and thus
occasionally segfaults in the test 10ddf-geometry.
Fix this by checking for NULL before calling it.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
---
monitor.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/monitor.c b/monitor.c
index b877e595c998..820a93d0ceaf 100644
--- a/monitor.c
+++ b/monitor.c
@@ -311,6 +311,9 @@ static int check_for_cleared_bb(struct active_array *a, struct mdinfo *mdi)
struct md_bb *bb;
int i;
+ if (!ss->get_bad_blocks)
+ return -1;
+
/*
* Get a list of bad blocks for an array, then read list of
* acknowledged bad blocks from kernel and compare it against metadata
--
2.30.2
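
The guard added here follows the usual pattern for optional hooks in an operations table. The sketch below uses invented names (struct sketch_ops, sketch_check_cleared_bb) and a dummy hook; it only illustrates the NULL check and is not the mdmon implementation.

```c
#include <stddef.h>

/* Illustrative operations table; the get_bad_blocks hook is optional. */
struct sketch_ops {
	int (*get_bad_blocks)(int slot);
};

/* Dummy hook used only to exercise the example. */
static int sketch_count_bad_blocks(int slot)
{
	return slot * 2;
}

/* Returns the hook's result, or -1 when the format has no such hook. */
static int sketch_check_cleared_bb(const struct sketch_ops *ss, int slot)
{
	if (!ss->get_bad_blocks)
		return -1;
	return ss->get_bad_blocks(slot);
}
```

A table initialised without the hook, for example `struct sketch_ops ops = { NULL };`, then yields -1 instead of a jump through a NULL pointer.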
* [PATCH mdadm v2 06/14] mdadm: Fix mdadm -r remove option regression
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (4 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 05/14] monitor: Avoid segfault when calling NULL get_bad_blocks Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-28 7:03 ` Mariusz Tkaczyk
2022-06-22 20:25 ` [PATCH mdadm v2 07/14] mdadm: Fix optional --write-behind parameter Logan Gunthorpe
` (9 subsequent siblings)
15 siblings, 1 reply; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
Wu Guanghao
The commit noted below globally adds a parameter to the -r option but missed
the fact that -r is used for another purpose: --remove.
After that commit, a command such as:
mdadm /dev/md0 -r /dev/loop0
will do nothing because the device parameter is consumed as an
argument to the -r option; thus, only one device is seen on the
command line, devs_found is only 1, and nothing happens.
This caused the 01r5integ and 01raid6integ tests to hang indefinitely
as mdadm did not remove the failed device. With the device not removed,
it would not be readded. Then the loop waiting for the array status to
change would loop forever.
That commit was recently reverted, but this left the original problem
with the monitor operations unfixed. So add monitor-specific short
options to restore the fix for the --monitor -r option.
Fixes: 546047688e1c ("mdadm: fix coredump of mdadm --monitor -r")
Fixes: 190dc029b141 ("Revert "mdadm: fix coredump of mdadm --monitor -r"")
Cc: Wu Guanghao <wuguanghao3@huawei.com>
Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
ReadMe.c | 1 +
mdadm.c | 1 +
mdadm.h | 1 +
3 files changed, 3 insertions(+)
diff --git a/ReadMe.c b/ReadMe.c
index bec1be9ab26f..7518a32a9869 100644
--- a/ReadMe.c
+++ b/ReadMe.c
@@ -82,6 +82,7 @@ char Version[] = "mdadm - v" VERSION " - " VERS_DATE EXTRAVERSION "\n";
*/
char short_options[]="-ABCDEFGIQhVXYWZ:vqbc:i:l:p:m:n:x:u:c:d:z:U:N:sarfRSow1tye:k:";
+char short_monitor_options[]="-ABCDEFGIQhVXYWZ:vqbc:i:l:p:m:r:n:x:u:c:d:z:U:N:safRSow1tye:k:";
char short_bitmap_options[]=
"-ABCDEFGIQhVXYWZ:vqb:c:i:l:p:m:n:x:u:c:d:z:U:N:sarfRSow1tye:k:";
char short_bitmap_auto_options[]=
diff --git a/mdadm.c b/mdadm.c
index be40686cf91b..d0c5e6def901 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -227,6 +227,7 @@ int main(int argc, char *argv[])
shortopt = short_bitmap_auto_options;
break;
case 'F': newmode = MONITOR;
+ shortopt = short_monitor_options;
break;
case 'G': newmode = GROW;
shortopt = short_bitmap_options;
diff --git a/mdadm.h b/mdadm.h
index d53df1697f88..05ef881f4709 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -419,6 +419,7 @@ enum mode {
};
extern char short_options[];
+extern char short_monitor_options[];
extern char short_bitmap_options[];
extern char short_bitmap_auto_options[];
extern struct option long_options[];
--
2.30.2
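
How a single ':' in the option string changes what is seen on the command line can be reproduced with a short getopt() sketch. It assumes glibc behaviour: a leading '-' in the option string makes non-option arguments come back as option code 1, and setting optind to 0 forces a full rescan. The helper names are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <unistd.h>

/* Count the arguments that getopt() hands back as plain devices. */
static int count_devices(const char *optstring, int argc, char **argv)
{
	int opt, devs = 0;

	optind = 0;	/* glibc: reinitialise, honouring the leading '-' */
	opterr = 0;	/* keep getopt() quiet in the example */
	while ((opt = getopt(argc, argv, optstring)) != -1)
		if (opt == 1)	/* non-option argument returned in order */
			devs++;
	return devs;
}

/* Parse "mdadm /dev/md0 -r /dev/loop0" with the given option string. */
static int devices_seen(const char *optstring)
{
	char *argv[] = { "mdadm", "/dev/md0", "-r", "/dev/loop0", NULL };

	return count_devices(optstring, 4, argv);
}
```

With "-r" both /dev/md0 and /dev/loop0 are counted as devices, whereas with "-r:" the second one is swallowed as the argument to -r and only one device remains. Keeping "r:" in a separate option string that is selected only in monitor mode avoids that side effect for the other modes.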
* Re: [PATCH mdadm v2 06/14] mdadm: Fix mdadm -r remove option regression
2022-06-22 20:25 ` [PATCH mdadm v2 06/14] mdadm: Fix mdadm -r remove option regression Logan Gunthorpe
@ 2022-06-28 7:03 ` Mariusz Tkaczyk
0 siblings, 0 replies; 23+ messages in thread
From: Mariusz Tkaczyk @ 2022-06-28 7:03 UTC (permalink / raw)
To: Logan Gunthorpe
Cc: linux-raid, Jes Sorensen, Song Liu, Christoph Hellwig,
Donald Buczek, Guoqing Jiang, Xiao Ni, Himanshu Madhani, Coly Li,
Bruce Dubbs, Stephen Bates, Martin Oliveira, David Sloan,
Wu Guanghao
On Wed, 22 Jun 2022 14:25:11 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:
> The commit noted below globally adds a parameter to the -r option but missed
> the fact that -r is used for another purpose: --remove.
>
> After that commit, a command such as:
>
> mdadm /dev/md0 -r /dev/loop0
>
> will do nothing seeing the device parameter will be consumed as an
> argument to the -r option; thus, there will only be one device
> seen on the command line, devs_found will only be 1 and nothing will
> happen.
>
> This caused the 01r5integ and 01raid6integ tests to hang indefinitely
> as mdadm did not remove the failed device. With the device not removed,
> it would not be readded. Then the loop waiting for the array status to
> change would loop forever.
>
> This commit was recently reverted, but the legitimate fix for the
> monitor operations was still not fixed. So add specific monitor
> short ops to re-fix the --monitor -r option.
>
> Fixes: 546047688e1c ("mdadm: fix coredump of mdadm --monitor -r")
> Fixes: 190dc029b141 ("Revert "mdadm: fix coredump of mdadm --monitor -r"")
> Cc: Wu Guanghao <wuguanghao3@huawei.com>
> Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* [PATCH mdadm v2 07/14] mdadm: Fix optional --write-behind parameter
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (5 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 06/14] mdadm: Fix mdadm -r remove option regression Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 08/14] tests/00raid0: add a test that validates raid0 with layout fails for 0.9 Logan Gunthorpe
` (8 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
Mateusz Grzonka
The commit noted below changed the behaviour of --write-behind to
require an argument. This broke the 06wrmostly test with the error:
mdadm: Invalid value for maximum outstanding write-behind writes: (null).
Must be between 0 and 16383.
To fix this, check if optarg is NULL before parsing it, as the original
code did.
Fixes: 60815698c0ac ("Refactor parse_num and use it to parse optarg.")
Cc: Mateusz Grzonka <mateusz.grzonka@intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
---
mdadm.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mdadm.c b/mdadm.c
index d0c5e6def901..56722ed997a2 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -1201,8 +1201,9 @@ int main(int argc, char *argv[])
case O(BUILD, WriteBehind):
case O(CREATE, WriteBehind):
s.write_behind = DEFAULT_MAX_WRITE_BEHIND;
- if (parse_num(&s.write_behind, optarg) != 0 ||
- s.write_behind < 0 || s.write_behind > 16383) {
+ if (optarg &&
+ (parse_num(&s.write_behind, optarg) != 0 ||
+ s.write_behind < 0 || s.write_behind > 16383)) {
pr_err("Invalid value for maximum outstanding write-behind writes: %s.\n\tMust be between 0 and 16383.\n",
optarg);
exit(2);
--
2.30.2
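
The handling of an optional option argument can be sketched on its own as follows. The helper name and the default value are placeholders chosen for the example, and strtol() stands in for mdadm's parse_num(); the 0 to 16383 range is the one quoted in the error message above.

```c
#include <errno.h>
#include <stdlib.h>

#define SK_DEFAULT_WRITE_BEHIND 256	/* placeholder default for the sketch */
#define SK_MAX_WRITE_BEHIND 16383	/* upper bound quoted in the error text */

/*
 * Parse an optional numeric argument.  arg == NULL means the option was
 * given without a value, so the default is kept.  Returns 0 on success
 * and -1 if a value was supplied but is not a number in range.
 */
static int sketch_parse_write_behind(const char *arg, int *out)
{
	char *end;
	long val;

	*out = SK_DEFAULT_WRITE_BEHIND;
	if (!arg)
		return 0;

	errno = 0;
	val = strtol(arg, &end, 10);
	if (errno || end == arg || *end != '\0' ||
	    val < 0 || val > SK_MAX_WRITE_BEHIND)
		return -1;

	*out = (int)val;
	return 0;
}
```

Checking for NULL first is what lets a bare --write-behind keep the default instead of being reported as an invalid value.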
* [PATCH mdadm v2 08/14] tests/00raid0: add a test that validates raid0 with layout fails for 0.9
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (6 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 07/14] mdadm: Fix optional --write-behind parameter Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 09/14] tests: fix raid0 tests for 0.90 metadata Logan Gunthorpe
` (7 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan,
Sudhakar Panneerselvam, Logan Gunthorpe
From: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
329dfc28debb disallows the creation of raid0 with layouts for 0.9
metadata. This test confirms the new behavior.
Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
tests/00raid0 | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/tests/00raid0 b/tests/00raid0
index 8bc18985f91a..e6b21cc419eb 100644
--- a/tests/00raid0
+++ b/tests/00raid0
@@ -6,11 +6,9 @@ check raid0
testdev $md0 3 $mdsize2_l 512
mdadm -S $md0
-# now with version-0.90 superblock
+# verify raid0 with layouts fail for 0.90
mdadm -CR $md0 -e0.90 -l0 -n4 $dev0 $dev1 $dev2 $dev3
-check raid0
-testdev $md0 4 $mdsize0 512
-mdadm -S $md0
+check opposite_result
# now with no superblock
mdadm -B $md0 -l0 -n5 $dev0 $dev1 $dev2 $dev3 $dev4
--
2.30.2
* [PATCH mdadm v2 09/14] tests: fix raid0 tests for 0.90 metadata
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (7 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 08/14] tests/00raid0: add a test that validates raid0 with layout fails for 0.9 Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 10/14] tests/04update-metadata: avoid passing chunk size to raid1 Logan Gunthorpe
` (6 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan,
Sudhakar Panneerselvam, Logan Gunthorpe
From: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Some of the test cases fail because raid0 creation fails with the error
"0.90 metadata does not support layouts for RAID0", added by commit
329dfc28debb. Fix some of the test cases by switching from raid0 to the
linear level for 0.9 metadata where possible.
Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
tests/00raid0 | 4 ++--
tests/00readonly | 4 ++++
tests/03r0assem | 6 +++---
tests/04r0update | 4 ++--
tests/04update-metadata | 2 +-
5 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/tests/00raid0 b/tests/00raid0
index e6b21cc419eb..9b8896cbdc52 100644
--- a/tests/00raid0
+++ b/tests/00raid0
@@ -20,8 +20,8 @@ mdadm -S $md0
# now same again with different chunk size
for chunk in 4 32 256
do
- mdadm -CR $md0 -e0.90 -l raid0 --chunk $chunk -n3 $dev0 $dev1 $dev2
- check raid0
+ mdadm -CR $md0 -e0.90 -l linear --chunk $chunk -n3 $dev0 $dev1 $dev2
+ check linear
testdev $md0 3 $mdsize0 $chunk
mdadm -S $md0
diff --git a/tests/00readonly b/tests/00readonly
index 28b0fa13f815..39202487f614 100644
--- a/tests/00readonly
+++ b/tests/00readonly
@@ -4,6 +4,10 @@ for metadata in 0.9 1.0 1.1 1.2
do
for level in linear raid0 raid1 raid4 raid5 raid6 raid10
do
+ if [[ $metadata == "0.9" && $level == "raid0" ]];
+ then
+ continue
+ fi
mdadm -CR $md0 -l $level -n 4 --metadata=$metadata \
$dev1 $dev2 $dev3 $dev4 --assume-clean
check nosync
diff --git a/tests/03r0assem b/tests/03r0assem
index 6744e3221062..44df06456233 100644
--- a/tests/03r0assem
+++ b/tests/03r0assem
@@ -68,9 +68,9 @@ mdadm -S $md2
### Now for version 0...
mdadm --zero-superblock $dev0 $dev1 $dev2
-mdadm -CR $md2 -l0 --metadata=0.90 -n3 $dev0 $dev1 $dev2
-check raid0
-tst="testdev $md2 3 $mdsize0 512"
+mdadm -CR $md2 -llinear --metadata=0.90 -n3 $dev0 $dev1 $dev2
+check linear
+tst="testdev $md2 3 $mdsize0 1"
$tst
uuid=`mdadm -Db $md2 | sed 's/.*UUID=//'`
diff --git a/tests/04r0update b/tests/04r0update
index 73ee3b9fed91..b95efb06c761 100644
--- a/tests/04r0update
+++ b/tests/04r0update
@@ -1,7 +1,7 @@
# create a raid0, re-assemble with a different super-minor
-mdadm -CR -e 0.90 $md0 -l0 -n3 $dev0 $dev1 $dev2
-testdev $md0 3 $mdsize0 512
+mdadm -CR -e 0.90 $md0 -llinear -n3 $dev0 $dev1 $dev2
+testdev $md0 3 $mdsize0 1
minor1=`mdadm -E $dev0 | sed -n -e 's/.*Preferred Minor : //p'`
mdadm -S /dev/md0
diff --git a/tests/04update-metadata b/tests/04update-metadata
index 232fc1ffff4b..08c14af7ed29 100644
--- a/tests/04update-metadata
+++ b/tests/04update-metadata
@@ -8,7 +8,7 @@ set -xe
dlist="$dev0 $dev1 $dev2 $dev3"
-for ls in raid0/4 linear/4 raid1/1 raid5/3 raid6/2
+for ls in linear/4 raid1/1 raid5/3 raid6/2
do
s=${ls#*/} l=${ls%/*}
mdadm -CR --assume-clean -e 0.90 $md0 --level $l -n 4 -c 64 $dlist
--
2.30.2
* [PATCH mdadm v2 10/14] tests/04update-metadata: avoid passing chunk size to raid1
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (8 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 09/14] tests: fix raid0 tests for 0.90 metadata Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 11/14] tests/02lineargrow: clear the superblock at every iteration Logan Gunthorpe
` (5 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan,
Sudhakar Panneerselvam, Logan Gunthorpe
From: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
The '04update-metadata' test fails with the error "specifying chunk size
is forbidden for this level", which was added by commit 5b30a34aa4b5e.
Correct the test so that it does not pass a chunk size to raid1.
Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
[logang@deltatee.com: fix if/then style and dropped unrelated hunk]
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
tests/04update-metadata | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/tests/04update-metadata b/tests/04update-metadata
index 08c14af7ed29..2b72a303b6a0 100644
--- a/tests/04update-metadata
+++ b/tests/04update-metadata
@@ -11,7 +11,11 @@ dlist="$dev0 $dev1 $dev2 $dev3"
for ls in linear/4 raid1/1 raid5/3 raid6/2
do
s=${ls#*/} l=${ls%/*}
- mdadm -CR --assume-clean -e 0.90 $md0 --level $l -n 4 -c 64 $dlist
+ if [[ $l == 'raid1' ]]; then
+ mdadm -CR --assume-clean -e 0.90 $md0 --level $l -n 4 $dlist
+ else
+ mdadm -CR --assume-clean -e 0.90 $md0 --level $l -n 4 -c 64 $dlist
+ fi
testdev $md0 $s 19904 64
mdadm -S $md0
mdadm -A $md0 --update=metadata $dlist
--
2.30.2
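
For readers following the loop in the hunk above: each entry such as
"raid5/3" is split into a level and a count with two parameter
expansions, and the chunk size is then only passed for levels other than
raid1. The sketch below shows that splitting in isolation. It is not part
of the patch; split_spec and chunk_args are hypothetical helper names
used only for illustration, and the mdadm command itself is not run.

```bash
# Illustrative sketch only, not part of the patch.
# split_spec and chunk_args are hypothetical helpers.

# Split "level/count" the same way the test does.
split_spec() {
	local ls="$1" l s
	s=${ls#*/}   # drop the shortest prefix ending in '/': "raid5/3" -> "3"
	l=${ls%/*}   # drop the shortest suffix starting at '/': "raid5/3" -> "raid5"
	echo "$l $s"
}

# Print the chunk-size arguments for a level: none for raid1, "-c 64" otherwise.
chunk_args() {
	if [ "$1" = "raid1" ]; then
		echo ""
	else
		echo "-c 64"
	fi
}

for ls in linear/4 raid1/1 raid5/3 raid6/2; do
	set -- $(split_spec "$ls")
	echo "level=$1 count=$2 chunk_args='$(chunk_args "$1")'"
done
```

The patch keeps two explicit mdadm invocations in an if/else instead,
which is arguably easier to read in a test script.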
* [PATCH mdadm v2 11/14] tests/02lineargrow: clear the superblock at every iteration
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (9 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 10/14] tests/04update-metadata: avoid passing chunk size to raid1 Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 12/14] mdadm/test: Add a mode to repeat specified tests Logan Gunthorpe
` (4 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan,
Sudhakar Panneerselvam, Logan Gunthorpe
From: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
This fixes the 02lineargrow test, as stale metadata left over from a
prior iteration causes the --add operation to misbehave.
Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
tests/02lineargrow | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tests/02lineargrow b/tests/02lineargrow
index e05c219d113a..595bf9f20802 100644
--- a/tests/02lineargrow
+++ b/tests/02lineargrow
@@ -20,4 +20,6 @@ do
testdev $md0 3 $sz 1
mdadm -S $md0
+ mdadm --zero /dev/loop2
+ mdadm --zero /dev/loop3
done
--
2.30.2
* [PATCH mdadm v2 12/14] mdadm/test: Add a mode to repeat specified tests
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (10 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 11/14] tests/02lineargrow: clear the superblock at every iteration Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 13/14] mdadm/test: Mark and ignore broken test failures Logan Gunthorpe
` (3 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
Many tests fail infrequently or rarely. To help find these, add
an option to run the tests multiple times by specifying --loop=N.
If --loop=0 is specified, the test will be looped forever.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
test | 36 ++++++++++++++++++++++++------------
1 file changed, 24 insertions(+), 12 deletions(-)
diff --git a/test b/test
index 711a3c7a2076..da6db5e0f631 100755
--- a/test
+++ b/test
@@ -10,6 +10,7 @@ devlist=
savelogs=0
exitonerror=1
+loop=1
prefix='[0-9][0-9]'
# use loop devices by default if doesn't specify --dev
@@ -117,6 +118,7 @@ do_help() {
--logdir=directory Directory to save all logfiles in
--save-logs Usually use with --logdir together
--keep-going | --no-error Don't stop on error, ie. run all tests
+ --loop=N Run tests N times (0 to run forever)
--dev=loop|lvm|ram|disk Use loop devices (default), LVM, RAM or disk
--disks= Provide a bunch of physical devices for test
--volgroup=name LVM volume group for LVM test
@@ -211,6 +213,9 @@ parse_args() {
--keep-going | --no-error )
exitonerror=0
;;
+ --loop=* )
+ loop="${i##*=}"
+ ;;
--disable-multipath )
unset MULTIPATH
;;
@@ -263,19 +268,26 @@ main() {
echo "Testing on linux-$(uname -r) kernel"
[ "$savelogs" == "1" ] &&
echo "Saving logs to $logdir"
- if [ "x$TESTLIST" != "x" ]
- then
- for script in ${TESTLIST[@]}
- do
- do_test $testdir/$script
- done
- else
- for script in $testdir/$prefix $testdir/$prefix*[^~]
- do
- do_test $script
- done
- fi
+ while true; do
+ if [ "x$TESTLIST" != "x" ]
+ then
+ for script in ${TESTLIST[@]}
+ do
+ do_test $testdir/$script
+ done
+ else
+ for script in $testdir/$prefix $testdir/$prefix*[^~]
+ do
+ do_test $script
+ done
+ fi
+
+ let loop=$loop-1
+ if [ "$loop" == "0" ]; then
+ break
+ fi
+ done
exit 0
}
--
2.30.2
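
The reason --loop=0 runs forever follows from the order of operations in
the hunk above: the counter is decremented first and only then compared
against 0, so a counter that starts at 0 goes negative and never matches.
The sketch below reproduces just that countdown. It is not part of the
patch; count_iterations and its cap argument are hypothetical, and the
cap exists only so that the demonstration terminates.

```bash
# Illustrative sketch only, not part of the patch.
# count_iterations is a hypothetical helper that mirrors the countdown in
# main(): decrement first, stop when the counter is exactly 0.
# The second argument is a safety cap so the "forever" case still returns.
count_iterations() {
	local loop="$1" cap="${2:-10}" runs=0
	while true; do
		runs=$((runs + 1))   # stand-in for one full pass over the tests
		loop=$((loop - 1))
		if [ "$loop" = "0" ] || [ "$runs" -ge "$cap" ]; then
			break
		fi
	done
	echo "$runs"
}

count_iterations 1      # default: a single pass
count_iterations 3      # --loop=3: three passes
count_iterations 0 10   # --loop=0: never reaches 0, stopped here by the cap
```

Note that a non-numeric value would behave differently in the real
script, since it uses 'let'; this sketch assumes an integer argument.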
* [PATCH mdadm v2 13/14] mdadm/test: Mark and ignore broken test failures
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (11 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 12/14] mdadm/test: Add a mode to repeat specified tests Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-06-22 20:25 ` [PATCH mdadm v2 14/14] tests: Add broken files for all broken tests Logan Gunthorpe
` (2 subsequent siblings)
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
Add functionality to continue if a test marked as broken fails.
To mark a test as broken, a file with the same name but with the suffix
'.broken' should exist. The first line in the file will be printed with
a KNOWN BROKEN message; the rest of the file can describe how the
test is broken.
Also adds --skip-broken and --skip-always-broken to skip all the tests
that have a .broken file or to skip all tests whose .broken file's first
line contains the keyword always.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
test | 37 +++++++++++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
diff --git a/test b/test
index da6db5e0f631..61d9ee83226e 100755
--- a/test
+++ b/test
@@ -10,6 +10,8 @@ devlist=
savelogs=0
exitonerror=1
+ctrl_c_error=0
+skipbroken=0
loop=1
prefix='[0-9][0-9]'
@@ -36,6 +38,7 @@ die() {
ctrl_c() {
exitonerror=1
+ ctrl_c_error=1
}
# mdadm always adds --quiet, and we want to see any unexpected messages
@@ -80,8 +83,21 @@ mdadm() {
do_test() {
_script=$1
_basename=`basename $_script`
+ _broken=0
+
if [ -f "$_script" ]
then
+ if [ -f "${_script}.broken" ]; then
+ _broken=1
+ _broken_msg=$(head -n1 "${_script}.broken" | tr -d '\n')
+ if [ "$skipbroken" == "all" ]; then
+ return
+ elif [ "$skipbroken" == "always" ] &&
+ [[ "$_broken_msg" == *always* ]]; then
+ return
+ fi
+ fi
+
rm -f $targetdir/stderr
# this might have been reset: restore the default.
echo 2000 > /proc/sys/dev/raid/speed_limit_max
@@ -98,10 +114,15 @@ do_test() {
else
save_log fail
_fail=1
+ if [ "$_broken" == "1" ]; then
+ echo " (KNOWN BROKEN TEST: $_broken_msg)"
+ fi
fi
[ "$savelogs" == "1" ] &&
mv -f $targetdir/log $logdir/$_basename.log
- [ "$_fail" == "1" -a "$exitonerror" == "1" ] && exit 1
+ [ "$ctrl_c_error" == "1" ] && exit 1
+ [ "$_fail" == "1" -a "$exitonerror" == "1" \
+ -a "$_broken" == "0" ] && exit 1
fi
}
@@ -119,6 +140,8 @@ do_help() {
--save-logs Usually use with --logdir together
--keep-going | --no-error Don't stop on error, ie. run all tests
--loop=N Run tests N times (0 to run forever)
+ --skip-broken Skip tests that are known to be broken
+ --skip-always-broken Skip tests that are known to always fail
--dev=loop|lvm|ram|disk Use loop devices (default), LVM, RAM or disk
--disks= Provide a bunch of physical devices for test
--volgroup=name LVM volume group for LVM test
@@ -216,6 +239,12 @@ parse_args() {
--loop=* )
loop="${i##*=}"
;;
+ --skip-broken )
+ skipbroken=all
+ ;;
+ --skip-always-broken )
+ skipbroken=always
+ ;;
--disable-multipath )
unset MULTIPATH
;;
@@ -279,7 +308,11 @@ main() {
else
for script in $testdir/$prefix $testdir/$prefix*[^~]
do
- do_test $script
+ case $script in
+ *.broken) ;;
+ *)
+ do_test $script
+ esac
done
fi
--
2.30.2
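
To summarise the decision made in do_test() above: a test without a
.broken file runs normally; a test with one is skipped under
--skip-broken, skipped under --skip-always-broken only if the first line
of the file contains "always", and otherwise runs with its failure
tolerated. The sketch below models only that decision. It is not part of
the patch; broken_action and the strings it prints are hypothetical, and
the files are throwaway stand-ins for entries in tests/.

```bash
# Illustrative sketch only, not part of the patch.
# broken_action is a hypothetical helper: given a test script path and the
# skipbroken mode ("0", "all" or "always"), print what would happen.
broken_action() {
	local script="$1" mode="$2" msg
	if [ ! -f "${script}.broken" ]; then
		echo "run"
		return
	fi
	msg=$(head -n1 "${script}.broken")
	case "$mode" in
	all)
		echo "skip" ;;
	always)
		case "$msg" in
		*always*) echo "skip" ;;
		*) echo "run-tolerate-failure" ;;
		esac ;;
	*)
		echo "run-tolerate-failure" ;;
	esac
}

# Demo with temporary files standing in for test scripts.
tmp=$(mktemp -d)
touch "$tmp/00ok" "$tmp/01flaky" "$tmp/04dead"
printf 'fails rarely\n\nDetails go here.\n' > "$tmp/01flaky.broken"
printf 'always fails\n' > "$tmp/04dead.broken"

for mode in 0 all always; do
	for t in 00ok 01flaky 04dead; do
		echo "mode=$mode $t: $(broken_action "$tmp/$t" "$mode")"
	done
done
rm -rf "$tmp"
```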
* [PATCH mdadm v2 14/14] tests: Add broken files for all broken tests
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (12 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 13/14] mdadm/test: Mark and ignore broken test failures Logan Gunthorpe
@ 2022-06-22 20:25 ` Logan Gunthorpe
2022-07-22 17:00 ` [PATCH mdadm v2 00/14] Bug fixes and testing improvments Himanshu Madhani
2022-08-07 20:35 ` Jes Sorensen
15 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-06-22 20:25 UTC (permalink / raw)
To: linux-raid, Jes Sorensen
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
Each broken file contains the rough frequency of brokenness as well
as a brief explanation of what happens when it breaks. Estimates
of failure rates are not statistically significant and can vary
run to run.
This is really just a view from my window. Tests were done on a
small VM with the default loop devices, not real hardware. We've
seen that different kernel configurations can cause bugs to appear as
well (i.e. different block schedulers). It may also be that different race
conditions will be seen on machines with different performance
characteristics.
These annotations were done with the kernel currently in md/md-next:
facef3b96c5b ("md: Notify sysfs sync_completed in md_reap_sync_thread()")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
tests/01r5integ.broken | 7 ++++
tests/01raid6integ.broken | 7 ++++
tests/04r5swap.broken | 7 ++++
tests/07autoassemble.broken | 8 ++++
tests/07autodetect.broken | 5 +++
tests/07changelevelintr.broken | 9 +++++
tests/07changelevels.broken | 9 +++++
tests/07reshape5intr.broken | 45 ++++++++++++++++++++++
tests/07revert-grow.broken | 31 +++++++++++++++
tests/07revert-shrink.broken | 9 +++++
tests/07testreshape5.broken | 12 ++++++
tests/09imsm-assemble.broken | 6 +++
tests/09imsm-create-fail-rebuild.broken | 5 +++
tests/09imsm-overlap.broken | 7 ++++
tests/10ddf-assemble-missing.broken | 6 +++
tests/10ddf-fail-create-race.broken | 7 ++++
tests/10ddf-fail-two-spares.broken | 5 +++
tests/10ddf-incremental-wrong-order.broken | 9 +++++
tests/14imsm-r1_2d-grow-r1_3d.broken | 5 +++
tests/14imsm-r1_2d-takeover-r0_2d.broken | 6 +++
tests/18imsm-r10_4d-takeover-r0_2d.broken | 5 +++
tests/18imsm-r1_2d-takeover-r0_1d.broken | 6 +++
tests/19raid6auto-repair.broken | 5 +++
tests/19raid6repair.broken | 5 +++
24 files changed, 226 insertions(+)
create mode 100644 tests/01r5integ.broken
create mode 100644 tests/01raid6integ.broken
create mode 100644 tests/04r5swap.broken
create mode 100644 tests/07autoassemble.broken
create mode 100644 tests/07autodetect.broken
create mode 100644 tests/07changelevelintr.broken
create mode 100644 tests/07changelevels.broken
create mode 100644 tests/07reshape5intr.broken
create mode 100644 tests/07revert-grow.broken
create mode 100644 tests/07revert-shrink.broken
create mode 100644 tests/07testreshape5.broken
create mode 100644 tests/09imsm-assemble.broken
create mode 100644 tests/09imsm-create-fail-rebuild.broken
create mode 100644 tests/09imsm-overlap.broken
create mode 100644 tests/10ddf-assemble-missing.broken
create mode 100644 tests/10ddf-fail-create-race.broken
create mode 100644 tests/10ddf-fail-two-spares.broken
create mode 100644 tests/10ddf-incremental-wrong-order.broken
create mode 100644 tests/14imsm-r1_2d-grow-r1_3d.broken
create mode 100644 tests/14imsm-r1_2d-takeover-r0_2d.broken
create mode 100644 tests/18imsm-r10_4d-takeover-r0_2d.broken
create mode 100644 tests/18imsm-r1_2d-takeover-r0_1d.broken
create mode 100644 tests/19raid6auto-repair.broken
create mode 100644 tests/19raid6repair.broken
diff --git a/tests/01r5integ.broken b/tests/01r5integ.broken
new file mode 100644
index 000000000000..207376372243
--- /dev/null
+++ b/tests/01r5integ.broken
@@ -0,0 +1,7 @@
+fails rarely
+
+Fails about 1 in every 30 runs with a sha mismatch error:
+
+ c49ab26e1b01def7874af9b8a6d6d0c29fdfafe6 /dev/md0 does not match
+ 15dc2f73262f811ada53c65e505ceec9cf025cb9 /dev/md0 with /dev/loop3
+ missing
diff --git a/tests/01raid6integ.broken b/tests/01raid6integ.broken
new file mode 100644
index 000000000000..1df735f08c8c
--- /dev/null
+++ b/tests/01raid6integ.broken
@@ -0,0 +1,7 @@
+fails infrequently
+
+Fails about 1 in 5 with a sha mismatch:
+
+ 8286c2bc045ae2cfe9f8b7ae3a898fa25db6926f /dev/md0 does not match
+ a083a0738b58caab37fd568b91b177035ded37df /dev/md0 with /dev/loop2 and
+ /dev/loop3 missing
diff --git a/tests/04r5swap.broken b/tests/04r5swap.broken
new file mode 100644
index 000000000000..e38987dbf01b
--- /dev/null
+++ b/tests/04r5swap.broken
@@ -0,0 +1,7 @@
+always fails
+
+Fails with errors:
+
+ mdadm: /dev/loop0 has no superblock - assembly aborted
+
+ ERROR: no recovery happening
diff --git a/tests/07autoassemble.broken b/tests/07autoassemble.broken
new file mode 100644
index 000000000000..8be09407f628
--- /dev/null
+++ b/tests/07autoassemble.broken
@@ -0,0 +1,8 @@
+always fails
+
+Prints lots of messages, but the array doesn't assemble. Error
+possibly related to:
+
+ mdadm: /dev/md/1 is busy - skipping
+ mdadm: no recogniseable superblock on /dev/md/testing:0
+ mdadm: /dev/md/2 is busy - skipping
diff --git a/tests/07autodetect.broken b/tests/07autodetect.broken
new file mode 100644
index 000000000000..294954a1f50a
--- /dev/null
+++ b/tests/07autodetect.broken
@@ -0,0 +1,5 @@
+always fails
+
+Fails with error:
+
+ ERROR: no resync happening
diff --git a/tests/07changelevelintr.broken b/tests/07changelevelintr.broken
new file mode 100644
index 000000000000..284b49068295
--- /dev/null
+++ b/tests/07changelevelintr.broken
@@ -0,0 +1,9 @@
+always fails
+
+Fails with errors:
+
+ mdadm: this change will reduce the size of the array.
+ use --grow --array-size first to truncate array.
+ e.g. mdadm --grow /dev/md0 --array-size 56832
+
+ ERROR: no reshape happening
diff --git a/tests/07changelevels.broken b/tests/07changelevels.broken
new file mode 100644
index 000000000000..9b930d932c48
--- /dev/null
+++ b/tests/07changelevels.broken
@@ -0,0 +1,9 @@
+always fails
+
+Fails with errors:
+
+ mdadm: /dev/loop0 is smaller than given size. 18976K < 19968K + metadata
+ mdadm: /dev/loop1 is smaller than given size. 18976K < 19968K + metadata
+ mdadm: /dev/loop2 is smaller than given size. 18976K < 19968K + metadata
+
+ ERROR: /dev/md0 isn't a block device.
diff --git a/tests/07reshape5intr.broken b/tests/07reshape5intr.broken
new file mode 100644
index 000000000000..efe52a667172
--- /dev/null
+++ b/tests/07reshape5intr.broken
@@ -0,0 +1,45 @@
+always fails
+
+This patch, recently added to md-next causes the test to always fail:
+
+7e6ba434cc60 ("md: don't unregister sync_thread with reconfig_mutex
+held")
+
+The new error is simply:
+
+ ERROR: no reshape happening
+
+Before the patch, the error seen is below.
+
+--
+
+fails infrequently
+
+Fails roughly 1 in 4 runs with errors:
+
+ mdadm: Merging with already-assembled /dev/md/0
+ mdadm: cannot re-read metadata from /dev/loop6 - aborting
+
+ ERROR: no reshape happening
+
+Also have seen a random deadlock:
+
+ INFO: task mdadm:109702 blocked for more than 30 seconds.
+ Not tainted 5.18.0-rc3-eid-vmlocalyes-dbg-00095-g3c2b5427979d #2040
+ "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
+ task:mdadm state:D stack: 0 pid:109702 ppid: 1 flags:0x00004000
+ Call Trace:
+ <TASK>
+ __schedule+0x67e/0x13b0
+ schedule+0x82/0x110
+ mddev_suspend+0x2e1/0x330
+ suspend_lo_store+0xbd/0x140
+ md_attr_store+0xcb/0x130
+ sysfs_kf_write+0x89/0xb0
+ kernfs_fop_write_iter+0x202/0x2c0
+ new_sync_write+0x222/0x330
+ vfs_write+0x3bc/0x4d0
+ ksys_write+0xd9/0x180
+ __x64_sys_write+0x43/0x50
+ do_syscall_64+0x3b/0x90
+ entry_SYSCALL_64_after_hwframe+0x44/0xae
diff --git a/tests/07revert-grow.broken b/tests/07revert-grow.broken
new file mode 100644
index 000000000000..9b6db86f60ab
--- /dev/null
+++ b/tests/07revert-grow.broken
@@ -0,0 +1,31 @@
+always fails
+
+This patch, recently added to md-next causes the test to always fail:
+
+7e6ba434cc60 ("md: don't unregister sync_thread with reconfig_mutex held")
+
+The errors are:
+
+ mdadm: No active reshape to revert on /dev/loop0
+ ERROR: active raid5 not found
+
+Before the patch, the error seen is below.
+
+--
+
+fails rarely
+
+Fails about 1 in every 30 runs with errors:
+
+ mdadm: Merging with already-assembled /dev/md/0
+ mdadm: backup file /tmp/md-backup inaccessible: No such file or directory
+ mdadm: failed to add /dev/loop1 to /dev/md/0: Invalid argument
+ mdadm: failed to add /dev/loop2 to /dev/md/0: Invalid argument
+ mdadm: failed to add /dev/loop3 to /dev/md/0: Invalid argument
+ mdadm: failed to add /dev/loop0 to /dev/md/0: Invalid argument
+ mdadm: /dev/md/0 assembled from 1 drive - need all 5 to start it
+ (use --run to insist).
+
+ grep: /sys/block/md*/md/sync_action: No such file or directory
+
+ ERROR: active raid5 not found
diff --git a/tests/07revert-shrink.broken b/tests/07revert-shrink.broken
new file mode 100644
index 000000000000..c33c39ec04f8
--- /dev/null
+++ b/tests/07revert-shrink.broken
@@ -0,0 +1,9 @@
+always fails
+
+Fails with errors:
+
+ mdadm: this change will reduce the size of the array.
+ use --grow --array-size first to truncate array.
+ e.g. mdadm --grow /dev/md0 --array-size 53760
+
+ ERROR: active raid5 not found
diff --git a/tests/07testreshape5.broken b/tests/07testreshape5.broken
new file mode 100644
index 000000000000..a8ce03e491b3
--- /dev/null
+++ b/tests/07testreshape5.broken
@@ -0,0 +1,12 @@
+always fails
+
+Test seems to run 'test_stripe' at $dir directory, but $dir is never
+set. If $dir is adjusted to $PWD, the test still fails with:
+
+ mdadm: /dev/loop2 is not suitable for this array.
+ mdadm: create aborted
+ ++ return 1
+ ++ cmp -s -n 8192 /dev/md0 /tmp/RandFile
+ ++ echo cmp failed
+ cmp failed
+ ++ exit 2
diff --git a/tests/09imsm-assemble.broken b/tests/09imsm-assemble.broken
new file mode 100644
index 000000000000..a6d4d5cf911b
--- /dev/null
+++ b/tests/09imsm-assemble.broken
@@ -0,0 +1,6 @@
+fails infrequently
+
+Fails roughly 1 in 10 runs with errors:
+
+ mdadm: /dev/loop2 is still in use, cannot remove.
+ /dev/loop2 removal from /dev/md/container should have succeeded
diff --git a/tests/09imsm-create-fail-rebuild.broken b/tests/09imsm-create-fail-rebuild.broken
new file mode 100644
index 000000000000..40c4b294da38
--- /dev/null
+++ b/tests/09imsm-create-fail-rebuild.broken
@@ -0,0 +1,5 @@
+always fails
+
+Fails with error:
+
+ **Error**: Array size mismatch - expected 3072, actual 16384
diff --git a/tests/09imsm-overlap.broken b/tests/09imsm-overlap.broken
new file mode 100644
index 000000000000..e7ccab768bea
--- /dev/null
+++ b/tests/09imsm-overlap.broken
@@ -0,0 +1,7 @@
+always fails
+
+Fails with errors:
+
+ **Error**: Offset mismatch - expected 15360, actual 0
+ **Error**: Offset mismatch - expected 15360, actual 0
+ /dev/md/vol3 failed check
diff --git a/tests/10ddf-assemble-missing.broken b/tests/10ddf-assemble-missing.broken
new file mode 100644
index 000000000000..bfd8d103a630
--- /dev/null
+++ b/tests/10ddf-assemble-missing.broken
@@ -0,0 +1,6 @@
+always fails
+
+Fails with errors:
+
+ ERROR: /dev/md/vol0 has unexpected state on /dev/loop10
+ ERROR: unexpected number of online disks on /dev/loop10
diff --git a/tests/10ddf-fail-create-race.broken b/tests/10ddf-fail-create-race.broken
new file mode 100644
index 000000000000..6c0df023fb18
--- /dev/null
+++ b/tests/10ddf-fail-create-race.broken
@@ -0,0 +1,7 @@
+usually fails
+
+Fails about 9 out of 10 times with many errors:
+
+ mdadm: cannot open MISSING: No such file or directory
+ ERROR: non-degraded array found
+ ERROR: disk 0 not marked as failed in meta data
diff --git a/tests/10ddf-fail-two-spares.broken b/tests/10ddf-fail-two-spares.broken
new file mode 100644
index 000000000000..eeea56d989ff
--- /dev/null
+++ b/tests/10ddf-fail-two-spares.broken
@@ -0,0 +1,5 @@
+fails infrequently
+
+Fails roughly 1 in 3 with error:
+
+ ERROR: /dev/md/vol1 should be optimal in meta data
diff --git a/tests/10ddf-incremental-wrong-order.broken b/tests/10ddf-incremental-wrong-order.broken
new file mode 100644
index 000000000000..a5af3bab2ec2
--- /dev/null
+++ b/tests/10ddf-incremental-wrong-order.broken
@@ -0,0 +1,9 @@
+always fails
+
+Fails with errors:
+ ERROR: sha1sum of /dev/md/vol0 has changed
+ ERROR: /dev/md/vol0 has unexpected state on /dev/loop10
+ ERROR: unexpected number of online disks on /dev/loop10
+ ERROR: /dev/md/vol0 has unexpected state on /dev/loop8
+ ERROR: unexpected number of online disks on /dev/loop8
+ ERROR: sha1sum of /dev/md/vol0 has changed
diff --git a/tests/14imsm-r1_2d-grow-r1_3d.broken b/tests/14imsm-r1_2d-grow-r1_3d.broken
new file mode 100644
index 000000000000..4ef1d4069b65
--- /dev/null
+++ b/tests/14imsm-r1_2d-grow-r1_3d.broken
@@ -0,0 +1,5 @@
+always fails
+
+Fails with error:
+
+ mdadm/tests/func.sh: line 325: dvsize/chunk: division by 0 (error token is "chunk")
diff --git a/tests/14imsm-r1_2d-takeover-r0_2d.broken b/tests/14imsm-r1_2d-takeover-r0_2d.broken
new file mode 100644
index 000000000000..89cd4e575362
--- /dev/null
+++ b/tests/14imsm-r1_2d-takeover-r0_2d.broken
@@ -0,0 +1,6 @@
+always fails
+
+Fails with error:
+
+ tests/func.sh: line 325: dvsize/chunk: division by 0 (error token
+ is "chunk")
diff --git a/tests/18imsm-r10_4d-takeover-r0_2d.broken b/tests/18imsm-r10_4d-takeover-r0_2d.broken
new file mode 100644
index 000000000000..a27399f5ed83
--- /dev/null
+++ b/tests/18imsm-r10_4d-takeover-r0_2d.broken
@@ -0,0 +1,5 @@
+fails rarely
+
+Fails about 1 run in 100 with message:
+
+ ERROR: size is wrong for /dev/md/vol0: 2 * 5120 (chunk=128) = 20480, not 0
diff --git a/tests/18imsm-r1_2d-takeover-r0_1d.broken b/tests/18imsm-r1_2d-takeover-r0_1d.broken
new file mode 100644
index 000000000000..aa1982e6acfd
--- /dev/null
+++ b/tests/18imsm-r1_2d-takeover-r0_1d.broken
@@ -0,0 +1,6 @@
+always fails
+
+Fails with error:
+
+ tests/func.sh: line 325: dvsize/chunk: division by 0 (error token
+ is "chunk")
diff --git a/tests/19raid6auto-repair.broken b/tests/19raid6auto-repair.broken
new file mode 100644
index 000000000000..e91a142575e2
--- /dev/null
+++ b/tests/19raid6auto-repair.broken
@@ -0,0 +1,5 @@
+always fails
+
+Fails with:
+
+ "should detect errors"
diff --git a/tests/19raid6repair.broken b/tests/19raid6repair.broken
new file mode 100644
index 000000000000..e91a142575e2
--- /dev/null
+++ b/tests/19raid6repair.broken
@@ -0,0 +1,5 @@
+always fails
+
+Fails with:
+
+ "should detect errors"
--
2.30.2
* Re: [PATCH mdadm v2 00/14] Bug fixes and testing improvments
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (13 preceding siblings ...)
2022-06-22 20:25 ` [PATCH mdadm v2 14/14] tests: Add broken files for all broken tests Logan Gunthorpe
@ 2022-07-22 17:00 ` Himanshu Madhani
2022-07-23 6:21 ` Coly Li
2022-08-07 20:35 ` Jes Sorensen
15 siblings, 1 reply; 23+ messages in thread
From: Himanshu Madhani @ 2022-07-22 17:00 UTC (permalink / raw)
To: Logan Gunthorpe
Cc: linux-raid@vger.kernel.org, Jes Sorensen, Song Liu,
Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
Mariusz Tkaczyk, Coly Li, Bruce Dubbs, Stephen Bates,
Martin Oliveira, David Sloan
Jes,
> On Jun 22, 2022, at 1:25 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
>
> Hi,
>
> This series tries to clean up the testing infrastructure to be a bit
> more reliable. It doesn't fix all the broken tests but annotates those
> that I see as broken so testing can continue. V2 includes changes
> requested in the feedback so far.
>
> As such, I've fixed all the kernel panics (in md-next now) and segfaults
> that caused testing to halt regardless of whether --keep-going was
> passed. I've also included some patches posted to the list from Sudhakar
> and Himanshu which fix some more broken tests.
>
> I've also included a patch which adds the --loop option to ./test which
> runs tests for a specified number of iterations (or indefinitely if zero
> is specified). This was very useful for ferreting out tests that failed
> randomly.
>
> The last two patches adds some infrastructure and annotation for known
> broken tests so that they don't stop the processing (even if
> --keep-going is not passed). Tests that are known to be broken can
> optionally be skipped with the --skip-broken or --skip-always-broken
> flags.
>
> With these changes it's possible to run './test --loop=0' for several
> days without stopping.
>
> There are still a number of broken tests which need more work, and there
> may be other issues on other people's systems (kernel configurations,
> etc) but that will have to be left to other developers.
>
> The tests that are still broken for me in one way or another are:
> 01r5integ, 01raid6integ, 04r5swap.broken, 04update-metadata,
> 07autoassemble, 07autodetect, 07changelevelintr, 07changelevels,
> 07reshape5intr, 07revert-grow, 07revert-shrink, 07testreshape5,
> 09imsm-assemble, 09imsm-create-fail-rebuild, 09imsm-overlap,
> 10ddf-assemble-missing, 10ddf-fail-create-race,
> 10ddf-fail-two-spares, 10ddf-incremental-wrong-order,
> 14imsm-r1_2d-grow-r1_3d, 14imsm-r1_2d-takeover-r0_2d,
> 18imsm-r10_4d-takeover-r0_2d, 18imsm-r1_2d-takeover-r0_1d,
> 19raid6auto-repair, 19raid6repair.broken
>
> Details on how they are broken can be found in the last patch.
>
> This series is based on the current kernel.org master (190dc029) and
> a git repo can be found here:
>
> https://github.com/lsgunth/mdadm bugfixes_v2
>
>
> Thanks,
>
> Logan
>
> --
>
> Changes since v1:
> * Rebase onto latest master (190dc029b141c423e), which means
> reworking patch 6 seeing the original patch was already
> reverted
> * Drop mdadm.static from the make target everything-test as well
> as everything (as pointed out by Mariusz)
> * Switch to using close_fd() helper in patch 4 (per Mariusz)
> * Fixed a couple minor typos and whitespace issues from Guoqing
> and Paul
> * Collected Acks from Mariusz
>
> --
>
> Logan Gunthorpe (10):
> Makefile: Don't build static build with everything and everything-test
> DDF: Cleanup validate_geometry_ddf_container()
> DDF: Fix NULL pointer dereference in validate_geometry_ddf()
> mdadm/Grow: Fix use after close bug by closing after fork
> monitor: Avoid segfault when calling NULL get_bad_blocks
> mdadm: Fix mdadm -r remove option regression
> mdadm: Fix optional --write-behind parameter
> mdadm/test: Add a mode to repeat specified tests
> mdadm/test: Mark and ignore broken test failures
> tests: Add broken files for all broken tests
>
> Sudhakar Panneerselvam (4):
> tests/00raid0: add a test that validates raid0 with layout fails for
> 0.9
> tests: fix raid0 tests for 0.90 metadata
> tests/04update-metadata: avoid passing chunk size to raid1
> tests/02lineargrow: clear the superblock at every iteration
>
> Grow.c | 4 +-
> Makefile | 4 +-
> ReadMe.c | 1 +
> mdadm.c | 6 +-
> mdadm.h | 1 +
> monitor.c | 3 +
> super-ddf.c | 94 ++++++++++------------
> test | 71 +++++++++++++---
> tests/00raid0 | 10 +--
> tests/00readonly | 4 +
> tests/01r5integ.broken | 7 ++
> tests/01raid6integ.broken | 7 ++
> tests/02lineargrow | 2 +
> tests/03r0assem | 6 +-
> tests/04r0update | 4 +-
> tests/04r5swap.broken | 7 ++
> tests/04update-metadata | 8 +-
> tests/07autoassemble.broken | 8 ++
> tests/07autodetect.broken | 5 ++
> tests/07changelevelintr.broken | 9 +++
> tests/07changelevels.broken | 9 +++
> tests/07reshape5intr.broken | 45 +++++++++++
> tests/07revert-grow.broken | 31 +++++++
> tests/07revert-shrink.broken | 9 +++
> tests/07testreshape5.broken | 12 +++
> tests/09imsm-assemble.broken | 6 ++
> tests/09imsm-create-fail-rebuild.broken | 5 ++
> tests/09imsm-overlap.broken | 7 ++
> tests/10ddf-assemble-missing.broken | 6 ++
> tests/10ddf-fail-create-race.broken | 7 ++
> tests/10ddf-fail-two-spares.broken | 5 ++
> tests/10ddf-incremental-wrong-order.broken | 9 +++
> tests/14imsm-r1_2d-grow-r1_3d.broken | 5 ++
> tests/14imsm-r1_2d-takeover-r0_2d.broken | 6 ++
> tests/18imsm-r10_4d-takeover-r0_2d.broken | 5 ++
> tests/18imsm-r1_2d-takeover-r0_1d.broken | 6 ++
> tests/19raid6auto-repair.broken | 5 ++
> tests/19raid6repair.broken | 5 ++
> 38 files changed, 361 insertions(+), 83 deletions(-)
> create mode 100644 tests/01r5integ.broken
> create mode 100644 tests/01raid6integ.broken
> create mode 100644 tests/04r5swap.broken
> create mode 100644 tests/07autoassemble.broken
> create mode 100644 tests/07autodetect.broken
> create mode 100644 tests/07changelevelintr.broken
> create mode 100644 tests/07changelevels.broken
> create mode 100644 tests/07reshape5intr.broken
> create mode 100644 tests/07revert-grow.broken
> create mode 100644 tests/07revert-shrink.broken
> create mode 100644 tests/07testreshape5.broken
> create mode 100644 tests/09imsm-assemble.broken
> create mode 100644 tests/09imsm-create-fail-rebuild.broken
> create mode 100644 tests/09imsm-overlap.broken
> create mode 100644 tests/10ddf-assemble-missing.broken
> create mode 100644 tests/10ddf-fail-create-race.broken
> create mode 100644 tests/10ddf-fail-two-spares.broken
> create mode 100644 tests/10ddf-incremental-wrong-order.broken
> create mode 100644 tests/14imsm-r1_2d-grow-r1_3d.broken
> create mode 100644 tests/14imsm-r1_2d-takeover-r0_2d.broken
> create mode 100644 tests/18imsm-r10_4d-takeover-r0_2d.broken
> create mode 100644 tests/18imsm-r1_2d-takeover-r0_1d.broken
> create mode 100644 tests/19raid6auto-repair.broken
> create mode 100644 tests/19raid6repair.broken
>
>
> base-commit: 190dc029b141c423e724566cbed5d5afbb10b05a
> --
> 2.30.2
I have not seen any updates or review comments on this series. Is there any plan to merge it?
I have been using this series for my developer testing, and it is a very helpful
update to the testing framework. It improves baseline testing and predictive failure coverage.
I find it very useful for improving the overall test infrastructure.
You can add my R-B for the series while merging.
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
--
Himanshu Madhani Oracle Linux Engineering
* Re: [PATCH mdadm v2 00/14] Bug fixes and testing improvments
2022-07-22 17:00 ` [PATCH mdadm v2 00/14] Bug fixes and testing improvments Himanshu Madhani
@ 2022-07-23 6:21 ` Coly Li
2022-08-08 20:22 ` Himanshu Madhani
0 siblings, 1 reply; 23+ messages in thread
From: Coly Li @ 2022-07-23 6:21 UTC (permalink / raw)
To: Himanshu Madhani
Cc: Logan Gunthorpe, linux-raid@vger.kernel.org, Jes Sorensen,
Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Mariusz Tkaczyk, Bruce Dubbs, Stephen Bates,
Martin Oliveira, David Sloan
> On Jul 23, 2022, at 01:00, Himanshu Madhani <himanshu.madhani@oracle.com> wrote:
>
> Jes,
>
>> On Jun 22, 2022, at 1:25 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
>>
>> Hi,
>>
>> This series tries to clean up the testing infrastructure to be a bit
>> more reliable. It doesn't fix all the broken tests but annotates those
>> that I see as broken so testing can continue. V2 includes changes
>> requested in the feedback so far.
>>
>> As such, I've fixed all the kernel panics (in md-next now) and segfaults
>> that caused testing to halt regardless of whether --keep-going was
>> passed. I've also included some patches posted to the list from Sudhakar
>> and Himanshu which fix some more broken tests.
>>
>> I've also included a patch which adds the --loop option to ./test which
>> runs tests for a specified number of iterations (or indefinitely if zero
>> is specified). This was very useful for ferreting out tests that failed
>> randomly.
>>
>> The last two patches add some infrastructure and annotation for known
>> broken tests so that they don't stop the processing (even if
>> --keep-going is not passed). Tests that are known to be broken can
>> optionally be skipped with the --skip-broken or --skip-always-broken
>> flags.
>>
>> With these changes it's possible to run './test --loop=0' for several
>> days without stopping.
>>
>> There are still a number of broken tests which need more work, and there
>> may be other issues on other people's systems (kernel configurations,
>> etc) but that will have to be left to other developers.
>>
>> The tests that are still broken for me in one way or another are:
>> 01r5integ, 01raid6integ, 04r5swap, 04update-metadata,
>> 07autoassemble, 07autodetect, 07changelevelintr, 07changelevels,
>> 07reshape5intr, 07revert-grow, 07revert-shrink, 07testreshape5,
>> 09imsm-assemble, 09imsm-create-fail-rebuild, 09imsm-overlap,
>> 10ddf-assemble-missing, 10ddf-fail-create-race,
>> 10ddf-fail-two-spares, 10ddf-incremental-wrong-order,
>> 14imsm-r1_2d-grow-r1_3d, 14imsm-r1_2d-takeover-r0_2d,
>> 18imsm-r10_4d-takeover-r0_2d, 18imsm-r1_2d-takeover-r0_1d,
>> 19raid6auto-repair, 19raid6repair
>>
>> Details on how they are broken can be found in the last patch.
>>
>> This series is based on the current kernel.org master (190dc029) and
>> a git repo can be found here:
>>
>> https://github.com/lsgunth/mdadm bugfixes_v2
>>
>>
>> Thanks,
>>
>> Logan
>>
>> --
>>
>> Changes since v1:
>> * Rebase onto latest master (190dc029b141c423e), which means
>> reworking patch 6 seeing the original patch was already
>> reverted
>> * Drop mdadm.static from the make target everything-test as well
>> as everything (as pointed out by Mariusz)
>> * Switch to using close_fd() helper in patch 4 (per Mariusz)
>> * Fixed a couple minor typos and whitespace issues from Guoqing
>> and Paul
>> * Collected Acks from Mariusz
>>
>> --
>>
>> Logan Gunthorpe (10):
>> Makefile: Don't build static build with everything and everything-test
>> DDF: Cleanup validate_geometry_ddf_container()
>> DDF: Fix NULL pointer dereference in validate_geometry_ddf()
>> mdadm/Grow: Fix use after close bug by closing after fork
>> monitor: Avoid segfault when calling NULL get_bad_blocks
>> mdadm: Fix mdadm -r remove option regression
>> mdadm: Fix optional --write-behind parameter
>> mdadm/test: Add a mode to repeat specified tests
>> mdadm/test: Mark and ignore broken test failures
>> tests: Add broken files for all broken tests
>>
>> Sudhakar Panneerselvam (4):
>> tests/00raid0: add a test that validates raid0 with layout fails for
>> 0.9
>> tests: fix raid0 tests for 0.90 metadata
>> tests/04update-metadata: avoid passing chunk size to raid1
>> tests/02lineargrow: clear the superblock at every iteration
>>
>> Grow.c | 4 +-
>> Makefile | 4 +-
>> ReadMe.c | 1 +
>> mdadm.c | 6 +-
>> mdadm.h | 1 +
>> monitor.c | 3 +
>> super-ddf.c | 94 ++++++++++------------
>> test | 71 +++++++++++++---
>> tests/00raid0 | 10 +--
>> tests/00readonly | 4 +
>> tests/01r5integ.broken | 7 ++
>> tests/01raid6integ.broken | 7 ++
>> tests/02lineargrow | 2 +
>> tests/03r0assem | 6 +-
>> tests/04r0update | 4 +-
>> tests/04r5swap.broken | 7 ++
>> tests/04update-metadata | 8 +-
>> tests/07autoassemble.broken | 8 ++
>> tests/07autodetect.broken | 5 ++
>> tests/07changelevelintr.broken | 9 +++
>> tests/07changelevels.broken | 9 +++
>> tests/07reshape5intr.broken | 45 +++++++++++
>> tests/07revert-grow.broken | 31 +++++++
>> tests/07revert-shrink.broken | 9 +++
>> tests/07testreshape5.broken | 12 +++
>> tests/09imsm-assemble.broken | 6 ++
>> tests/09imsm-create-fail-rebuild.broken | 5 ++
>> tests/09imsm-overlap.broken | 7 ++
>> tests/10ddf-assemble-missing.broken | 6 ++
>> tests/10ddf-fail-create-race.broken | 7 ++
>> tests/10ddf-fail-two-spares.broken | 5 ++
>> tests/10ddf-incremental-wrong-order.broken | 9 +++
>> tests/14imsm-r1_2d-grow-r1_3d.broken | 5 ++
>> tests/14imsm-r1_2d-takeover-r0_2d.broken | 6 ++
>> tests/18imsm-r10_4d-takeover-r0_2d.broken | 5 ++
>> tests/18imsm-r1_2d-takeover-r0_1d.broken | 6 ++
>> tests/19raid6auto-repair.broken | 5 ++
>> tests/19raid6repair.broken | 5 ++
>> 38 files changed, 361 insertions(+), 83 deletions(-)
>> create mode 100644 tests/01r5integ.broken
>> create mode 100644 tests/01raid6integ.broken
>> create mode 100644 tests/04r5swap.broken
>> create mode 100644 tests/07autoassemble.broken
>> create mode 100644 tests/07autodetect.broken
>> create mode 100644 tests/07changelevelintr.broken
>> create mode 100644 tests/07changelevels.broken
>> create mode 100644 tests/07reshape5intr.broken
>> create mode 100644 tests/07revert-grow.broken
>> create mode 100644 tests/07revert-shrink.broken
>> create mode 100644 tests/07testreshape5.broken
>> create mode 100644 tests/09imsm-assemble.broken
>> create mode 100644 tests/09imsm-create-fail-rebuild.broken
>> create mode 100644 tests/09imsm-overlap.broken
>> create mode 100644 tests/10ddf-assemble-missing.broken
>> create mode 100644 tests/10ddf-fail-create-race.broken
>> create mode 100644 tests/10ddf-fail-two-spares.broken
>> create mode 100644 tests/10ddf-incremental-wrong-order.broken
>> create mode 100644 tests/14imsm-r1_2d-grow-r1_3d.broken
>> create mode 100644 tests/14imsm-r1_2d-takeover-r0_2d.broken
>> create mode 100644 tests/18imsm-r10_4d-takeover-r0_2d.broken
>> create mode 100644 tests/18imsm-r1_2d-takeover-r0_1d.broken
>> create mode 100644 tests/19raid6auto-repair.broken
>> create mode 100644 tests/19raid6repair.broken
>>
>>
>> base-commit: 190dc029b141c423e724566cbed5d5afbb10b05a
>> --
>> 2.30.2
>
> I have not seen any updates or review comments on this series. Is there any plan to merge it?
>
> I have been using this series for my developer testing, and it is a very helpful
> testing framework update. This update improves baseline testing and predictive failure coverage.
> I find it very useful to work on improving the overall test infrastructure.
>
> You can add my R-B for the series while merging.
>
> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
I recently finished going through all these fixes. Once the remaining patches (around 4 or 5) in my review queue are finished, I will submit them all to Jes to handle the next step, with your Reviewed-by tag.
Coly Li
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH mdadm v2 00/14] Bug fixes and testing improvments
2022-07-23 6:21 ` Coly Li
@ 2022-08-08 20:22 ` Himanshu Madhani
0 siblings, 0 replies; 23+ messages in thread
From: Himanshu Madhani @ 2022-08-08 20:22 UTC (permalink / raw)
To: Coly Li
Cc: Logan Gunthorpe, linux-raid@vger.kernel.org, Jes Sorensen,
Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Mariusz Tkaczyk, Bruce Dubbs, Stephen Bates,
Martin Oliveira, David Sloan
Hi Coly,
> On Jul 22, 2022, at 11:21 PM, Coly Li <colyli@suse.de> wrote:
>
> I recently finished going through all these fixes. Once the remaining patches (around 4 or 5) in my review queue are finished, I will submit them all to Jes to handle the next step, with your Reviewed-by tag.
Thanks for the update and help with reviews.
--
Himanshu Madhani Oracle Linux Engineering
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH mdadm v2 00/14] Bug fixes and testing improvments
2022-06-22 20:25 [PATCH mdadm v2 00/14] Bug fixes and testing improvments Logan Gunthorpe
` (14 preceding siblings ...)
2022-07-22 17:00 ` [PATCH mdadm v2 00/14] Bug fixes and testing improvments Himanshu Madhani
@ 2022-08-07 20:35 ` Jes Sorensen
2022-08-08 15:46 ` Logan Gunthorpe
15 siblings, 1 reply; 23+ messages in thread
From: Jes Sorensen @ 2022-08-07 20:35 UTC (permalink / raw)
To: Logan Gunthorpe, linux-raid
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan
On 6/22/22 16:25, Logan Gunthorpe wrote:
> Hi,
>
> This series tries to clean up the testing infrastructure to be a bit
> more reliable. It doesn't fix all the broken tests but annotates those
> that I see as broken so testing can continue. V2 includes changes
> requested in the feedback so far.
>
> As such, I've fixed all the kernel panics (in md-next now) and segfaults
> that caused testing to halt regardless of whether --keep-going was
> passed. I've also included some patches posted to the list from Sudhakar
> and Himanshu which fix some more broken tests.
>
> I've also included a patch which adds the --loop option to ./test which
> runs tests for a specified number of iterations (or indefinitely if zero
> is specified). This was very useful for ferreting out tests that failed
> randomly.
>
> The last two patches add some infrastructure and annotation for known
> broken tests so that they don't stop the processing (even if
> --keep-going is not passed). Tests that are known to be broken can
> optionally be skipped with the --skip-broken or --skip-always-broken
> flags.
>
> With these changes it's possible to run './test --loop=0' for several
> days without stopping.
>
> There are still a number of broken tests which need more work, and there
> may be other issues on other people's systems (kernel configurations,
> etc) but that will have to be left to other developers.
>
> The tests that are still broken for me in one way or another are:
> 01r5integ, 01raid6integ, 04r5swap, 04update-metadata,
> 07autoassemble, 07autodetect, 07changelevelintr, 07changelevels,
> 07reshape5intr, 07revert-grow, 07revert-shrink, 07testreshape5,
> 09imsm-assemble, 09imsm-create-fail-rebuild, 09imsm-overlap,
> 10ddf-assemble-missing, 10ddf-fail-create-race,
> 10ddf-fail-two-spares, 10ddf-incremental-wrong-order,
> 14imsm-r1_2d-grow-r1_3d, 14imsm-r1_2d-takeover-r0_2d,
> 18imsm-r10_4d-takeover-r0_2d, 18imsm-r1_2d-takeover-r0_1d,
> 19raid6auto-repair, 19raid6repair
>
> Details on how they are broken can be found in the last patch.
>
> This series is based on the current kernel.org master (190dc029) and
> a git repo can be found here:
>
> https://github.com/lsgunth/mdadm bugfixes_v2
Applied,
I am traveling and brought a new laptop, without the SSH key I need to
push, so I'll push things next week when I get home.
Thanks,
Jes
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH mdadm v2 00/14] Bug fixes and testing improvments
2022-08-07 20:35 ` Jes Sorensen
@ 2022-08-08 15:46 ` Logan Gunthorpe
0 siblings, 0 replies; 23+ messages in thread
From: Logan Gunthorpe @ 2022-08-08 15:46 UTC (permalink / raw)
To: Jes Sorensen, linux-raid
Cc: Song Liu, Christoph Hellwig, Donald Buczek, Guoqing Jiang,
Xiao Ni, Himanshu Madhani, Mariusz Tkaczyk, Coly Li, Bruce Dubbs,
Stephen Bates, Martin Oliveira, David Sloan
On 2022-08-07 14:35, Jes Sorensen wrote:
> Applied,
>
> I am traveling and brought a new laptop, without the SSH key I need to
> push, so I'll push things next week when I get home.
Thanks!
Logan
^ permalink raw reply [flat|nested] 23+ messages in thread