linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Fix mdadm vs udev race in Incremental and Assemble
@ 2011-10-20 10:06 Jes.Sorensen
  2011-10-20 10:06 ` [PATCH 1/3] Remove race for starting container devices Jes.Sorensen
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jes.Sorensen @ 2011-10-20 10:06 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, lukasz.dorau, adam.kwolek, dledford

From: Jes Sorensen <Jes.Sorensen@redhat.com>

Hi,

This patch set fixes the problem with mdadm racing udev during during
incremental or regular assembly of IMSM raids. This set fixes a bug in
the initial patch I posted earlier, and adds the locking to Assemble()
which I didn't have in the first round.

There are three patches to the set:
1) Hold the lock while running Incremental_container() to avoid udev
   (or someone else like a boot script) kicking off an additional
   mdadm instance to try and assemble the container in parallel.
2) Don't send udev a 'change' event to handle incremental assembly
   since we are about to do it ourselves anyway.
3) Hold the lock during Assemble() again to avoid udev kicking in
   behind our backs.

With these patches in place, I can no longer reproduce the case I was
seeing on Fedora 16 Beta where an array would not come up correctly.

Patches are on top of Neil's current git repository.

Comments?

Cheers,
Jes


Jes Sorensen (3):
  Remove race for starting container devices.
  Don't tell sysfs to launch the container as we are doing it ourselves
  Hold the map lock while performing Assemble to avoid races with udev

 Incremental.c |   27 ++++++++++++++-------------
 mdadm.c       |    6 ++++++
 2 files changed, 20 insertions(+), 13 deletions(-)

-- 
1.7.6.4


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] Remove race for starting container devices.
  2011-10-20 10:06 [PATCH 0/3] Fix mdadm vs udev race in Incremental and Assemble Jes.Sorensen
@ 2011-10-20 10:06 ` Jes.Sorensen
  2011-10-20 10:06 ` [PATCH 2/3] Don't tell sysfs to launch the container as we are doing it ourselves Jes.Sorensen
  2011-10-20 10:06 ` [PATCH 3/3] Hold the map lock while performing Assemble to avoid races with udev Jes.Sorensen
  2 siblings, 0 replies; 7+ messages in thread
From: Jes.Sorensen @ 2011-10-20 10:06 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, lukasz.dorau, adam.kwolek, dledford

From: Jes Sorensen <Jes.Sorensen@redhat.com>

This moves the lock handling out of Incremental_container() and relies
on the caller holding the lock. This prevents conflict with a
follow-on mdadm comment which may try and launch the device in
parallel.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
---
 Incremental.c |   26 ++++++++++++++------------
 1 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/Incremental.c b/Incremental.c
index b90089b..4a5e568 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -139,10 +139,16 @@ int Incremental(char *devname, int verbose, int runstop,
 			rv = st->ss->load_container(st, dfd, NULL);
 
 		close(dfd);
-		if (!rv && st->ss->container_content)
-			return Incremental_container(st, devname, homehost,
-						     verbose, runstop, autof,
-						     freeze_reshape);
+		if (!rv && st->ss->container_content) {
+			if (map_lock(&map))
+				fprintf(stderr, Name ": failed to get "
+					"exclusive lock on mapfile\n");
+			rv = Incremental_container(st, devname, homehost,
+						   verbose, runstop, autof,
+						   freeze_reshape);
+			map_unlock(&map);
+			return rv;
+		}
 
 		fprintf(stderr, Name ": %s is not part of an md array.\n",
 			devname);
@@ -440,7 +446,6 @@ int Incremental(char *devname, int verbose, int runstop,
 	if (info.array.level == LEVEL_CONTAINER) {
 		int devnum = devnum; /* defined and used iff ->external */
 		/* Try to assemble within the container */
-		map_unlock(&map);
 		sysfs_uevent(&info, "change");
 		if (verbose >= 0)
 			fprintf(stderr, Name
@@ -451,9 +456,10 @@ int Incremental(char *devname, int verbose, int runstop,
 			devnum = fd2devnum(mdfd);
 		close(mdfd);
 		sysfs_free(sra);
-		rv = Incremental(chosen_name, verbose, runstop,
-				 NULL, homehost, require_homehost, autof,
-				 freeze_reshape);
+		rv = Incremental_container(st, chosen_name, homehost,
+					   verbose, runstop, autof,
+					   freeze_reshape);
+		map_unlock(&map);
 		if (rv == 1)
 			/* Don't fail the whole -I if a subarray didn't
 			 * have enough devices to start yet
@@ -1463,9 +1469,6 @@ static int Incremental_container(struct supertype *st, char *devname,
 		trustworthy = FOREIGN;
 
 	list = st->ss->container_content(st, NULL);
-	if (map_lock(&map))
-		fprintf(stderr, Name ": failed to get exclusive lock on "
-			"mapfile\n");
 	/* do not assemble arrays that might have bad blocks */
 	if (list && list->array.state & (1<<MD_SB_BBM_ERRORS)) {
 		fprintf(stderr, Name ": BBM log found in metadata. "
@@ -1607,7 +1610,6 @@ static int Incremental_container(struct supertype *st, char *devname,
 			close(sfd);
 	}
 	domain_free(domains);
-	map_unlock(&map);
 	return 0;
 }
 
-- 
1.7.6.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] Don't tell sysfs to launch the container as we are doing it ourselves
  2011-10-20 10:06 [PATCH 0/3] Fix mdadm vs udev race in Incremental and Assemble Jes.Sorensen
  2011-10-20 10:06 ` [PATCH 1/3] Remove race for starting container devices Jes.Sorensen
@ 2011-10-20 10:06 ` Jes.Sorensen
  2011-10-20 10:06 ` [PATCH 3/3] Hold the map lock while performing Assemble to avoid races with udev Jes.Sorensen
  2 siblings, 0 replies; 7+ messages in thread
From: Jes.Sorensen @ 2011-10-20 10:06 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, lukasz.dorau, adam.kwolek, dledford

From: Jes Sorensen <Jes.Sorensen@redhat.com>

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
---
 Incremental.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/Incremental.c b/Incremental.c
index 4a5e568..f5f0f33 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -446,7 +446,6 @@ int Incremental(char *devname, int verbose, int runstop,
 	if (info.array.level == LEVEL_CONTAINER) {
 		int devnum = devnum; /* defined and used iff ->external */
 		/* Try to assemble within the container */
-		sysfs_uevent(&info, "change");
 		if (verbose >= 0)
 			fprintf(stderr, Name
 				": container %s now has %d devices\n",
-- 
1.7.6.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] Hold the map lock while performing Assemble to avoid races with udev
  2011-10-20 10:06 [PATCH 0/3] Fix mdadm vs udev race in Incremental and Assemble Jes.Sorensen
  2011-10-20 10:06 ` [PATCH 1/3] Remove race for starting container devices Jes.Sorensen
  2011-10-20 10:06 ` [PATCH 2/3] Don't tell sysfs to launch the container as we are doing it ourselves Jes.Sorensen
@ 2011-10-20 10:06 ` Jes.Sorensen
  2 siblings, 0 replies; 7+ messages in thread
From: Jes.Sorensen @ 2011-10-20 10:06 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, lukasz.dorau, adam.kwolek, dledford

From: Jes Sorensen <Jes.Sorensen@redhat.com>

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
---
 mdadm.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/mdadm.c b/mdadm.c
index ebf1c46..56de7b7 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -1279,6 +1279,7 @@ int main(int argc, char *argv[])
 		} else {
 			struct mddev_ident *a, *array_list =  conf_get_ident(NULL);
 			struct mddev_dev *devlist = conf_get_devs();
+			struct map_ent *map = NULL;
 			int cnt = 0;
 			int failures, successes;
 			if (devlist == NULL) {
@@ -1298,6 +1299,10 @@ int main(int argc, char *argv[])
 				if (a->autof == 0)
 					a->autof = autof;
 			}
+			if (map_lock(&map))
+				fprintf(stderr, Name " %s: failed to get "
+					"exclusive lock on mapfile\n",
+					__func__);
 			do {
 				failures = 0;
 				successes = 0;
@@ -1364,6 +1369,7 @@ int main(int argc, char *argv[])
 				fprintf(stderr, Name ": No arrays found in config file\n");
 				rv = 1;
 			}
+			map_unlock(&map);
 		}
 		break;
 	case BUILD:
-- 
1.7.6.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 1/3] Remove race for starting container devices.
  2011-10-21 13:33 [PATCH v2 0/3] Fix mdadm vs udev race in Incremental and Assemble Jes.Sorensen
@ 2011-10-21 13:33 ` Jes.Sorensen
  2011-10-22  0:49   ` NeilBrown
  0 siblings, 1 reply; 7+ messages in thread
From: Jes.Sorensen @ 2011-10-21 13:33 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, lukasz.dorau, adam.kwolek, dledford

From: Jes Sorensen <Jes.Sorensen@redhat.com>

This moves the lock handling out of Incremental_container() and relies
on the caller holding the lock. This prevents conflict with a
follow-on mdadm comment which may try and launch the device in
parallel.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
---
 Incremental.c |   29 ++++++++++++++---------------
 1 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/Incremental.c b/Incremental.c
index b90089b..1a9a97b 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -139,10 +139,16 @@ int Incremental(char *devname, int verbose, int runstop,
 			rv = st->ss->load_container(st, dfd, NULL);
 
 		close(dfd);
-		if (!rv && st->ss->container_content)
-			return Incremental_container(st, devname, homehost,
-						     verbose, runstop, autof,
-						     freeze_reshape);
+		if (!rv && st->ss->container_content) {
+			if (map_lock(&map))
+				fprintf(stderr, Name ": failed to get "
+					"exclusive lock on mapfile\n");
+			rv = Incremental_container(st, devname, homehost,
+						   verbose, runstop, autof,
+						   freeze_reshape);
+			map_unlock(&map);
+			return rv;
+		}
 
 		fprintf(stderr, Name ": %s is not part of an md array.\n",
 			devname);
@@ -440,7 +446,6 @@ int Incremental(char *devname, int verbose, int runstop,
 	if (info.array.level == LEVEL_CONTAINER) {
 		int devnum = devnum; /* defined and used iff ->external */
 		/* Try to assemble within the container */
-		map_unlock(&map);
 		sysfs_uevent(&info, "change");
 		if (verbose >= 0)
 			fprintf(stderr, Name
@@ -451,9 +456,10 @@ int Incremental(char *devname, int verbose, int runstop,
 			devnum = fd2devnum(mdfd);
 		close(mdfd);
 		sysfs_free(sra);
-		rv = Incremental(chosen_name, verbose, runstop,
-				 NULL, homehost, require_homehost, autof,
-				 freeze_reshape);
+		rv = Incremental_container(st, chosen_name, homehost,
+					   verbose, runstop, autof,
+					   freeze_reshape);
+		map_unlock(&map);
 		if (rv == 1)
 			/* Don't fail the whole -I if a subarray didn't
 			 * have enough devices to start yet
@@ -1463,16 +1469,12 @@ static int Incremental_container(struct supertype *st, char *devname,
 		trustworthy = FOREIGN;
 
 	list = st->ss->container_content(st, NULL);
-	if (map_lock(&map))
-		fprintf(stderr, Name ": failed to get exclusive lock on "
-			"mapfile\n");
 	/* do not assemble arrays that might have bad blocks */
 	if (list && list->array.state & (1<<MD_SB_BBM_ERRORS)) {
 		fprintf(stderr, Name ": BBM log found in metadata. "
 					"Cannot activate array(s).\n");
 		/* free container data and exit */
 		sysfs_free(list);
-		map_unlock(&map);
 		return 2;
 	}
 
@@ -1536,7 +1538,6 @@ static int Incremental_container(struct supertype *st, char *devname,
 					fprintf(stderr, Name ": array %s/%s is "
 						"explicitly ignored by mdadm.conf\n",
 						match->container, match->member);
-				map_unlock(&map);
 				return 2;
 			}
 			if (match)
@@ -1552,7 +1553,6 @@ static int Incremental_container(struct supertype *st, char *devname,
 		if (mdfd < 0) {
 			fprintf(stderr, Name ": failed to open %s: %s.\n",
 				chosen_name, strerror(errno));
-			map_unlock(&map);
 			return 2;
 		}
 
@@ -1607,7 +1607,6 @@ static int Incremental_container(struct supertype *st, char *devname,
 			close(sfd);
 	}
 	domain_free(domains);
-	map_unlock(&map);
 	return 0;
 }
 
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] Remove race for starting container devices.
  2011-10-21 13:33 ` [PATCH 1/3] Remove race for starting container devices Jes.Sorensen
@ 2011-10-22  0:49   ` NeilBrown
  2011-10-22  8:22     ` Jes Sorensen
  0 siblings, 1 reply; 7+ messages in thread
From: NeilBrown @ 2011-10-22  0:49 UTC (permalink / raw)
  To: Jes.Sorensen; +Cc: linux-raid, lukasz.dorau, adam.kwolek, dledford

[-- Attachment #1: Type: text/plain, Size: 1659 bytes --]

On Fri, 21 Oct 2011 15:33:18 +0200 Jes.Sorensen@redhat.com wrote:

> From: Jes Sorensen <Jes.Sorensen@redhat.com>
> 
> This moves the lock handling out of Incremental_container() and relies
> on the caller holding the lock. This prevents conflict with a
> follow-on mdadm comment which may try and launch the device in
> parallel.
> 
> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>

Hi Jes,

 I have applied the 7 patches you sent.

However I have question mark over part of this one:


> @@ -451,9 +456,10 @@ int Incremental(char *devname, int verbose, int runstop,
>  			devnum = fd2devnum(mdfd);
>  		close(mdfd);
>  		sysfs_free(sra);
> -		rv = Incremental(chosen_name, verbose, runstop,
> -				 NULL, homehost, require_homehost, autof,
> -				 freeze_reshape);
> +		rv = Incremental_container(st, chosen_name, homehost,
> +					   verbose, runstop, autof,
> +					   freeze_reshape);
> +		map_unlock(&map);
>  		if (rv == 1)
>  			/* Don't fail the whole -I if a subarray didn't
>  			 * have enough devices to start yet

You have replaced a call to Incremental with a call to Incremental_container.
I feel that is significant enough that I would have liked to have seen it
discussed in the changelog comment.

You seem have have open-coded Incremental and then removed all the bits that
you don't need in this case - which is that vast majority of it.
But you didn't include the call to ->load_content().

So I have put it back in because I'm sure it must be harmless (Because it was
there already) but it may not be needed.

If you have thoughts on this, please let me know.

Thanks,
NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] Remove race for starting container devices.
  2011-10-22  0:49   ` NeilBrown
@ 2011-10-22  8:22     ` Jes Sorensen
  0 siblings, 0 replies; 7+ messages in thread
From: Jes Sorensen @ 2011-10-22  8:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid, lukasz.dorau, adam.kwolek, dledford

On 10/22/11 02:49, NeilBrown wrote:
> Hi Jes,
> 
>  I have applied the 7 patches you sent.
> 
> However I have question mark over part of this one:
> 
> 
>> > @@ -451,9 +456,10 @@ int Incremental(char *devname, int verbose, int runstop,
>> >  			devnum = fd2devnum(mdfd);
>> >  		close(mdfd);
>> >  		sysfs_free(sra);
>> > -		rv = Incremental(chosen_name, verbose, runstop,
>> > -				 NULL, homehost, require_homehost, autof,
>> > -				 freeze_reshape);
>> > +		rv = Incremental_container(st, chosen_name, homehost,
>> > +					   verbose, runstop, autof,
>> > +					   freeze_reshape);
>> > +		map_unlock(&map);
>> >  		if (rv == 1)
>> >  			/* Don't fail the whole -I if a subarray didn't
>> >  			 * have enough devices to start yet
> You have replaced a call to Incremental with a call to Incremental_container.
> I feel that is significant enough that I would have liked to have seen it
> discussed in the changelog comment.
> 
> You seem have have open-coded Incremental and then removed all the bits that
> you don't need in this case - which is that vast majority of it.
> But you didn't include the call to ->load_content().
> 
> So I have put it back in because I'm sure it must be harmless (Because it was
> there already) but it may not be needed.
> 
> If you have thoughts on this, please let me know.

Hi Neil,

There is a perfectly good explanation for this: The reason is that the
guy who wrote the initial patch is an idiot!

In fact, said idiot spent most of Friday pulling his hairs out trying to
figure out why the modified mdadm would boot fine on a systemd based
system but fail on a sysvinit based system. The result being the patch
bombing you were exposed to yesterday :)

Thanks for catching this, much appreciated!

Jes

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-10-22  8:22 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-20 10:06 [PATCH 0/3] Fix mdadm vs udev race in Incremental and Assemble Jes.Sorensen
2011-10-20 10:06 ` [PATCH 1/3] Remove race for starting container devices Jes.Sorensen
2011-10-20 10:06 ` [PATCH 2/3] Don't tell sysfs to launch the container as we are doing it ourselves Jes.Sorensen
2011-10-20 10:06 ` [PATCH 3/3] Hold the map lock while performing Assemble to avoid races with udev Jes.Sorensen
  -- strict thread matches above, loose matches on Subject: below --
2011-10-21 13:33 [PATCH v2 0/3] Fix mdadm vs udev race in Incremental and Assemble Jes.Sorensen
2011-10-21 13:33 ` [PATCH 1/3] Remove race for starting container devices Jes.Sorensen
2011-10-22  0:49   ` NeilBrown
2011-10-22  8:22     ` Jes Sorensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).