* [PATCH 1/2] FIX: mdmon doesn't start
@ 2011-11-03 16:55 Adam Kwolek
2011-11-03 16:55 ` [PATCH 2/2] FIX: Do not continue container reshape when mdmon is absent Adam Kwolek
2011-11-07 0:46 ` [PATCH 1/2] FIX: mdmon doesn't start NeilBrown
0 siblings, 2 replies; 4+ messages in thread
From: Adam Kwolek @ 2011-11-03 16:55 UTC (permalink / raw)
To: neilb; +Cc: linux-raid, ed.ciechanowski, marcin.labun, dan.j.williams
When an array is not cleanly dismounted, the directory /dev/.mdadm is not cleaned up.
On array re-assembly the pid that is read back is not valid, so it is not
possible to connect to the monitor. This causes mdmon to exit, and the array
remains unmonitored.
The problem was introduced by the fix:
mdmon(): Error out if failing to connect to victim monitor
819c158866f466075a1c719f0dc496deb2fb3814
This is critical for container reshape, where mdmon should finish the reshape.
When the reshape is not finished, the array is reshaped again by mdadm.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
---
mdmon.c | 15 ++++++++++-----
1 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/mdmon.c b/mdmon.c
index bdcda0e..5ac7cd6 100644
--- a/mdmon.c
+++ b/mdmon.c
@@ -458,11 +458,16 @@ static int mdmon(char *devname, int devnum, int must_fork, int takeover)
victim = mdmon_pid(container->devnum);
if (victim >= 0) {
- victim_sock = connect_monitor(container->devname);
- if (victim_sock < 0) {
- fprintf(stderr, "mdmon: %s unable to connect monitor\n",
- container->devname);
- exit(3);
+ /* It is possible that mdmon that wrote pid file was killed.
+ * check if read pid is valid/mdmon is running
+ */
+ if (mdmon_running(victim)) {
+ victim_sock = connect_monitor(container->devname);
+ if (victim_sock < 0) {
+ fprintf(stderr, "mdmon: %s unable to connect "
+ "monitor\n", container->devname);
+ exit(3);
+ }
}
}
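
The stale-pid situation described in this patch can be illustrated with a
minimal sketch. It assumes that "is the monitor still running" can be answered
by probing the recorded pid with signal 0, which is the usual POSIX existence
check. The name pid_is_alive() is hypothetical and is not mdadm's
mdmon_running(); this is only an illustration of the idea, not the code in the
tree.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper (not part of mdadm): report whether a process with
 * the given pid currently exists.  Signal 0 performs the permission and
 * existence checks without delivering anything to the target. */
static int pid_is_alive(pid_t pid)
{
	if (pid <= 0)
		return 0;		/* no usable pid was read */
	if (kill(pid, 0) == 0)
		return 1;		/* process exists and we may signal it */
	/* EPERM: the process exists but belongs to someone else.
	 * ESRCH: no such process, i.e. the pid file is stale. */
	return errno == EPERM;
}
```

A check like this cannot tell a live monitor from an unrelated process that
happens to have been given the same pid after a reboot, so it narrows the
window rather than closing it.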
* [PATCH 2/2] FIX: Do not continue container reshape when mdmon is absent
2011-11-03 16:55 [PATCH 1/2] FIX: mdmon doesn't start Adam Kwolek
@ 2011-11-03 16:55 ` Adam Kwolek
2011-11-07 0:47 ` NeilBrown
2011-11-07 0:46 ` [PATCH 1/2] FIX: mdmon doesn't start NeilBrown
1 sibling, 1 reply; 4+ messages in thread
From: Adam Kwolek @ 2011-11-03 16:55 UTC (permalink / raw)
To: neilb; +Cc: linux-raid, ed.ciechanowski, marcin.labun, dan.j.williams
When mdmon is absent the metadata is not updated, and reshape_container()
can fall into an endless loop. This can cause user data corruption.
When mdmon is absent, do not continue the container reshape process.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
---
Grow.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/Grow.c b/Grow.c
index e7fd7c4..184a973 100644
--- a/Grow.c
+++ b/Grow.c
@@ -2562,6 +2562,12 @@ int reshape_container(char *container, char *devname,
restart = 0;
if (rv)
break;
+ rv = !mdmon_running(devname2devnum(container));
+ if (rv) {
+ printf(Name ": Mdmon is not found. "
+ "Cannot continue container reshape.\n");
+ break;
+ }
}
if (!rv)
unfreeze(st);
* Re: [PATCH 2/2] FIX: Do not continue container reshape when mdmon is absent
2011-11-03 16:55 ` [PATCH 2/2] FIX: Do not continue container reshape when mdmon is absent Adam Kwolek
@ 2011-11-07 0:47 ` NeilBrown
0 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2011-11-07 0:47 UTC (permalink / raw)
To: Adam Kwolek; +Cc: linux-raid, ed.ciechanowski, marcin.labun, dan.j.williams
On Thu, 03 Nov 2011 17:55:41 +0100 Adam Kwolek <adam.kwolek@intel.com> wrote:
> When mdmon is absent the metadata is not updated, and reshape_container()
> can fall into an endless loop. This can cause user data corruption.
>
> When mdmon is absent, do not continue the container reshape process.
>
> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
> ---
>
> Grow.c | 6 ++++++
> 1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/Grow.c b/Grow.c
> index e7fd7c4..184a973 100644
> --- a/Grow.c
> +++ b/Grow.c
> @@ -2562,6 +2562,12 @@ int reshape_container(char *container, char *devname,
> restart = 0;
> if (rv)
> break;
> + rv = !mdmon_running(devname2devnum(container));
> + if (rv) {
> + printf(Name ": Mdmon is not found. "
> + "Cannot continue container reshape.\n");
> + break;
> + }
> }
> if (!rv)
> unfreeze(st);
>
Applied, thanks.
NeilBrown
* Re: [PATCH 1/2] FIX: mdmon doesn't start
2011-11-03 16:55 [PATCH 1/2] FIX: mdmon doesn't start Adam Kwolek
2011-11-03 16:55 ` [PATCH 2/2] FIX: Do not continue container reshape when mdmon is absent Adam Kwolek
@ 2011-11-07 0:46 ` NeilBrown
1 sibling, 0 replies; 4+ messages in thread
From: NeilBrown @ 2011-11-07 0:46 UTC (permalink / raw)
To: Adam Kwolek
Cc: linux-raid, ed.ciechanowski, marcin.labun, dan.j.williams,
Jes.Sorensen
On Thu, 03 Nov 2011 17:55:33 +0100 Adam Kwolek <adam.kwolek@intel.com> wrote:
> When an array is not cleanly dismounted, the directory /dev/.mdadm is not cleaned up.
> On array re-assembly the pid that is read back is not valid, so it is not
> possible to connect to the monitor. This causes mdmon to exit, and the array
> remains unmonitored.
> The problem was introduced by the fix:
> mdmon(): Error out if failing to connect to victim monitor
> 819c158866f466075a1c719f0dc496deb2fb3814
>
> This is critical for container reshape, where mdmon should finish the reshape.
> When the reshape is not finished, the array is reshaped again by mdadm.
>
> Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
> ---
>
> mdmon.c | 15 ++++++++++-----
> 1 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/mdmon.c b/mdmon.c
> index bdcda0e..5ac7cd6 100644
> --- a/mdmon.c
> +++ b/mdmon.c
> @@ -458,11 +458,16 @@ static int mdmon(char *devname, int devnum, int must_fork, int takeover)
>
> victim = mdmon_pid(container->devnum);
> if (victim >= 0) {
> - victim_sock = connect_monitor(container->devname);
> - if (victim_sock < 0) {
> - fprintf(stderr, "mdmon: %s unable to connect monitor\n",
> - container->devname);
> - exit(3);
> + /* It is possible that mdmon that wrote pid file was killed.
> + * check if read pid is valid/mdmon is running
> + */
> + if (mdmon_running(victim)) {
> + victim_sock = connect_monitor(container->devname);
> + if (victim_sock < 0) {
> + fprintf(stderr, "mdmon: %s unable to connect "
> + "monitor\n", container->devname);
> + exit(3);
> + }
> }
> }
>
Thanks for the patch.
I decided to revert the patch that originally caused the problem instead - it
really isn't needed.
I then added a patch to make sure we never use victim_sock when it is -1.
The places where we might have used it were not dangerous at all, but it is
cleaner to check.
Thanks,
NeilBrown
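
The guard described in the reply above (never using victim_sock when it is -1)
is not shown in this thread. A minimal sketch of that kind of check, assuming
the only cleanup needed is closing the descriptor, might look like the
following. The helper name close_if_open() is hypothetical and does not appear
in mdadm.

```c
#include <unistd.h>

/* Hypothetical helper (not part of mdadm): close a socket descriptor only
 * if it was actually opened.  Returns 1 if a descriptor was closed and
 * 0 if there was nothing to do. */
static int close_if_open(int *sock)
{
	if (*sock < 0)
		return 0;	/* -1 means connect never succeeded: skip */
	close(*sock);
	*sock = -1;		/* mark as unused so later code skips it too */
	return 1;
}
```

Resetting the descriptor to -1 after closing keeps every later use behind the
same test, which is the cleanliness argument made in the reply.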