* [PATCH mdadm 0/2] Bug fixes for --write-zeros option
@ 2024-06-04 16:38 Logan Gunthorpe
2024-06-04 16:38 ` [PATCH mdadm 1/2] mdadm: Fix hang race condition in wait_for_zero_forks() Logan Gunthorpe
` (3 more replies)
0 siblings, 4 replies; 6+ messages in thread
From: Logan Gunthorpe @ 2024-06-04 16:38 UTC (permalink / raw)
To: linux-raid, Jes Sorensen, Xiao Ni
Cc: Guoqing Jiang, Mariusz Tkaczyk, Logan Gunthorpe
Hi,
Xiao noticed that the write-zeros tests failed randomly, especially
with small disks. We tracked this down to an issue with signalfd which
coallesced SIGCHLD signals into one. This is fixed by checking the
status of all children after every SIGCHLD.
While we were at it, we noticed a potential reace with SIGCHLD coming
in before the signal was blocked in wait_for_zero_forks() and fix this
by moving the blocking before the child creation.
Thanks,
Logan
--
Logan Gunthorpe (2):
mdadm: Fix hang race condition in wait_for_zero_forks()
mdadm: Block SIGCHLD processes before starting children
Create.c | 29 ++++++++++++++++-------------
1 file changed, 16 insertions(+), 13 deletions(-)
base-commit: 46f19270265fe54cda1c728cb156b755273b4ab6
--
2.39.2
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH mdadm 1/2] mdadm: Fix hang race condition in wait_for_zero_forks()
2024-06-04 16:38 [PATCH mdadm 0/2] Bug fixes for --write-zeros option Logan Gunthorpe
@ 2024-06-04 16:38 ` Logan Gunthorpe
2024-06-04 21:34 ` Paul Menzel
2024-06-04 16:38 ` [PATCH mdadm 2/2] mdadm: Block SIGCHLD processes before starting children Logan Gunthorpe
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Logan Gunthorpe @ 2024-06-04 16:38 UTC (permalink / raw)
To: linux-raid, Jes Sorensen, Xiao Ni
Cc: Guoqing Jiang, Mariusz Tkaczyk, Logan Gunthorpe
Running a create operation with --write-zeros can randomly hang
forever waiting for child processes. This happens roughly on in
ten runs with when running with small (20MB) loop devices.
The bug is caused by the fact that signals can be coallesced into
one if they are not read by signalfd quick enough. So if two children
finish at exactly the same time, only one SIGCHLD will be received
by the parent.
To fix this, wait on all processes with WNOHANG every time a SIGCHLD
is recieved and exit when all processes have been waited on.
Reported-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
Create.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/Create.c b/Create.c
index d033eb68f30c..4f992a22b7c9 100644
--- a/Create.c
+++ b/Create.c
@@ -178,6 +178,7 @@ static int wait_for_zero_forks(int *zero_pids, int count)
bool interrupted = false;
sigset_t sigset;
ssize_t s;
+ pid_t pid;
for (i = 0; i < count; i++)
if (zero_pids[i])
@@ -196,7 +197,7 @@ static int wait_for_zero_forks(int *zero_pids, int count)
return 1;
}
- while (1) {
+ while (wait_count) {
s = read(sfd, &fdsi, sizeof(fdsi));
if (s != sizeof(fdsi)) {
pr_err("Invalid signalfd read: %s\n", strerror(errno));
@@ -209,23 +210,24 @@ static int wait_for_zero_forks(int *zero_pids, int count)
pr_info("Interrupting zeroing processes, please wait...\n");
interrupted = true;
} else if (fdsi.ssi_signo == SIGCHLD) {
- if (!--wait_count)
- break;
+ for (i = 0; i < count; i++) {
+ if (!zero_pids[i])
+ continue;
+
+ pid = waitpid(zero_pids[i], &wstatus, WNOHANG);
+ if (pid <= 0)
+ continue;
+
+ zero_pids[i] = 0;
+ if (!WIFEXITED(wstatus) || WEXITSTATUS(wstatus))
+ ret = 1;
+ wait_count--;
+ }
}
}
close(sfd);
- for (i = 0; i < count; i++) {
- if (!zero_pids[i])
- continue;
-
- waitpid(zero_pids[i], &wstatus, 0);
- zero_pids[i] = 0;
- if (!WIFEXITED(wstatus) || WEXITSTATUS(wstatus))
- ret = 1;
- }
-
if (interrupted) {
pr_err("zeroing interrupted!\n");
return 1;
--
2.39.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH mdadm 2/2] mdadm: Block SIGCHLD processes before starting children
2024-06-04 16:38 [PATCH mdadm 0/2] Bug fixes for --write-zeros option Logan Gunthorpe
2024-06-04 16:38 ` [PATCH mdadm 1/2] mdadm: Fix hang race condition in wait_for_zero_forks() Logan Gunthorpe
@ 2024-06-04 16:38 ` Logan Gunthorpe
2024-06-12 10:15 ` [PATCH mdadm 0/2] Bug fixes for --write-zeros option Mariusz Tkaczyk
2024-06-13 13:24 ` Mariusz Tkaczyk
3 siblings, 0 replies; 6+ messages in thread
From: Logan Gunthorpe @ 2024-06-04 16:38 UTC (permalink / raw)
To: linux-raid, Jes Sorensen, Xiao Ni
Cc: Guoqing Jiang, Mariusz Tkaczyk, Logan Gunthorpe
There is a small race condition noticed during code review, but
never actully hit in practice, with the write_zero feature.
If a write zeros fork finishes quickly before wait_for_zero_forks()
gets called, then the SIGCHLD will be delivered before the signalfd
is setup.
While this is only theoretical, fix this by blocking the SIGCHLD
signal before forking any children.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
Create.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/Create.c b/Create.c
index 4f992a22b7c9..bd4875e4a408 100644
--- a/Create.c
+++ b/Create.c
@@ -401,6 +401,7 @@ static int add_disks(int mdfd, struct mdinfo *info, struct shape *s,
*/
sigemptyset(&sigset);
sigaddset(&sigset, SIGINT);
+ sigaddset(&sigset, SIGCHLD);
sigprocmask(SIG_BLOCK, &sigset, &orig_sigset);
memset(zero_pids, 0, sizeof(zero_pids));
--
2.39.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH mdadm 1/2] mdadm: Fix hang race condition in wait_for_zero_forks()
2024-06-04 16:38 ` [PATCH mdadm 1/2] mdadm: Fix hang race condition in wait_for_zero_forks() Logan Gunthorpe
@ 2024-06-04 21:34 ` Paul Menzel
0 siblings, 0 replies; 6+ messages in thread
From: Paul Menzel @ 2024-06-04 21:34 UTC (permalink / raw)
To: Logan Gunthorpe
Cc: linux-raid, Jes Sorensen, Xiao Ni, Guoqing Jiang, Mariusz Tkaczyk
Dear Logan,
Thank you for the patch.
Am 04.06.24 um 18:38 schrieb Logan Gunthorpe:
> Running a create operation with --write-zeros can randomly hang
> forever waiting for child processes. This happens roughly on in
> ten runs with when running with small (20MB) loop devices.
>
> The bug is caused by the fact that signals can be coallesced into
> one if they are not read by signalfd quick enough. So if two children
> finish at exactly the same time, only one SIGCHLD will be received
> by the parent.
>
> To fix this, wait on all processes with WNOHANG every time a SIGCHLD
> is recieved and exit when all processes have been waited on.
received
> Reported-by: Xiao Ni <xni@redhat.com>
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> ---
> Create.c | 28 +++++++++++++++-------------
> 1 file changed, 15 insertions(+), 13 deletions(-)
>
> diff --git a/Create.c b/Create.c
> index d033eb68f30c..4f992a22b7c9 100644
> --- a/Create.c
> +++ b/Create.c
> @@ -178,6 +178,7 @@ static int wait_for_zero_forks(int *zero_pids, int count)
> bool interrupted = false;
> sigset_t sigset;
> ssize_t s;
> + pid_t pid;
>
> for (i = 0; i < count; i++)
> if (zero_pids[i])
> @@ -196,7 +197,7 @@ static int wait_for_zero_forks(int *zero_pids, int count)
> return 1;
> }
>
> - while (1) {
> + while (wait_count) {
> s = read(sfd, &fdsi, sizeof(fdsi));
> if (s != sizeof(fdsi)) {
> pr_err("Invalid signalfd read: %s\n", strerror(errno));
> @@ -209,23 +210,24 @@ static int wait_for_zero_forks(int *zero_pids, int count)
> pr_info("Interrupting zeroing processes, please wait...\n");
> interrupted = true;
> } else if (fdsi.ssi_signo == SIGCHLD) {
> - if (!--wait_count)
> - break;
> + for (i = 0; i < count; i++) {
> + if (!zero_pids[i])
> + continue;
> +
> + pid = waitpid(zero_pids[i], &wstatus, WNOHANG);
> + if (pid <= 0)
> + continue;
> +
> + zero_pids[i] = 0;
> + if (!WIFEXITED(wstatus) || WEXITSTATUS(wstatus))
> + ret = 1;
> + wait_count--;
> + }
> }
> }
>
> close(sfd);
>
> - for (i = 0; i < count; i++) {
> - if (!zero_pids[i])
> - continue;
> -
> - waitpid(zero_pids[i], &wstatus, 0);
> - zero_pids[i] = 0;
> - if (!WIFEXITED(wstatus) || WEXITSTATUS(wstatus))
> - ret = 1;
> - }
> -
> if (interrupted) {
> pr_err("zeroing interrupted!\n");
> return 1;
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Kind regards,
Paul
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH mdadm 0/2] Bug fixes for --write-zeros option
2024-06-04 16:38 [PATCH mdadm 0/2] Bug fixes for --write-zeros option Logan Gunthorpe
2024-06-04 16:38 ` [PATCH mdadm 1/2] mdadm: Fix hang race condition in wait_for_zero_forks() Logan Gunthorpe
2024-06-04 16:38 ` [PATCH mdadm 2/2] mdadm: Block SIGCHLD processes before starting children Logan Gunthorpe
@ 2024-06-12 10:15 ` Mariusz Tkaczyk
2024-06-13 13:24 ` Mariusz Tkaczyk
3 siblings, 0 replies; 6+ messages in thread
From: Mariusz Tkaczyk @ 2024-06-12 10:15 UTC (permalink / raw)
To: Logan Gunthorpe; +Cc: linux-raid, Jes Sorensen, Xiao Ni, Guoqing Jiang
On Tue, 4 Jun 2024 10:38:35 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:
> Hi,
>
> Xiao noticed that the write-zeros tests failed randomly, especially
> with small disks. We tracked this down to an issue with signalfd which
> coallesced SIGCHLD signals into one. This is fixed by checking the
> status of all children after every SIGCHLD.
>
> While we were at it, we noticed a potential reace with SIGCHLD coming
> in before the signal was blocked in wait_for_zero_forks() and fix this
> by moving the blocking before the child creation.
>
> Thanks,
>
> Logan
>
> --
Hello Logan,
Thanks for fixes. LGTM.
I will fix typo when merging, no need to sent v2.
I have proxy issue, I have to solve it first.
Thanks,
Mariusz
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH mdadm 0/2] Bug fixes for --write-zeros option
2024-06-04 16:38 [PATCH mdadm 0/2] Bug fixes for --write-zeros option Logan Gunthorpe
` (2 preceding siblings ...)
2024-06-12 10:15 ` [PATCH mdadm 0/2] Bug fixes for --write-zeros option Mariusz Tkaczyk
@ 2024-06-13 13:24 ` Mariusz Tkaczyk
3 siblings, 0 replies; 6+ messages in thread
From: Mariusz Tkaczyk @ 2024-06-13 13:24 UTC (permalink / raw)
To: Logan Gunthorpe; +Cc: linux-raid, Jes Sorensen, Xiao Ni, Guoqing Jiang
On Tue, 4 Jun 2024 10:38:35 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:
> Hi,
>
> Xiao noticed that the write-zeros tests failed randomly, especially
> with small disks. We tracked this down to an issue with signalfd which
> coallesced SIGCHLD signals into one. This is fixed by checking the
> status of all children after every SIGCHLD.
>
> While we were at it, we noticed a potential reace with SIGCHLD coming
> in before the signal was blocked in wait_for_zero_forks() and fix this
> by moving the blocking before the child creation.
>
> Thanks,
>
> Logan
>
> --
Applied!
Thanks,
Mariusz
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-06-13 13:24 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-04 16:38 [PATCH mdadm 0/2] Bug fixes for --write-zeros option Logan Gunthorpe
2024-06-04 16:38 ` [PATCH mdadm 1/2] mdadm: Fix hang race condition in wait_for_zero_forks() Logan Gunthorpe
2024-06-04 21:34 ` Paul Menzel
2024-06-04 16:38 ` [PATCH mdadm 2/2] mdadm: Block SIGCHLD processes before starting children Logan Gunthorpe
2024-06-12 10:15 ` [PATCH mdadm 0/2] Bug fixes for --write-zeros option Mariusz Tkaczyk
2024-06-13 13:24 ` Mariusz Tkaczyk
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).