linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Create.c: Try few more times to stop array after failed creation
@ 2014-09-05 14:26 Pawel Baldysiak
  2014-09-08  6:33 ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: Pawel Baldysiak @ 2014-09-05 14:26 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid, pawel.baldysiak, artur.paszkiewicz

Sometimes after failure in creation (exp. due to duplicate devices
in create command) newly created empty md array will not be stopped
due to openers>1 (create_mddev will not manage to drop lock).
In this case ioctl() will return error - this needs to be checked
and if occurs - sending STOP_ARRAY should be repeat after delay
to make sure that mddev is stopped correctly.

Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
---
 Create.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Create.c b/Create.c
index 330c5b4..7c8e53e 100644
--- a/Create.c
+++ b/Create.c
@@ -904,7 +904,12 @@ int Create(struct supertype *st, char *mddev,
 				if (st->ss->add_to_super(st, &inf->disk,
 							 fd, dv->devname,
 							 dv->data_offset)) {
-					ioctl(mdfd, STOP_ARRAY, NULL);
+					int count = 5;
+					while (count &&
+					       (ioctl(mdfd, STOP_ARRAY, NULL) < 0)) {
+						usleep(100000);
+						count--;
+					}
 					goto abort_locked;
 				}
 				st->ss->getinfo_super(st, inf, NULL);


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] Create.c: Try few more times to stop array after failed creation
  2014-09-05 14:26 [PATCH] Create.c: Try few more times to stop array after failed creation Pawel Baldysiak
@ 2014-09-08  6:33 ` NeilBrown
  2014-09-09 10:04   ` Baldysiak, Pawel
  0 siblings, 1 reply; 4+ messages in thread
From: NeilBrown @ 2014-09-08  6:33 UTC (permalink / raw)
  To: Pawel Baldysiak; +Cc: linux-raid, artur.paszkiewicz

[-- Attachment #1: Type: text/plain, Size: 2334 bytes --]

On Fri, 05 Sep 2014 16:26:13 +0200 Pawel Baldysiak
<pawel.baldysiak@intel.com> wrote:

> Sometimes after failure in creation (exp. due to duplicate devices
> in create command) newly created empty md array will not be stopped
> due to openers>1 (create_mddev will not manage to drop lock).
> In this case ioctl() will return error - this needs to be checked
> and if occurs - sending STOP_ARRAY should be repeat after delay
> to make sure that mddev is stopped correctly.
> 
> Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
> ---
>  Create.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/Create.c b/Create.c
> index 330c5b4..7c8e53e 100644
> --- a/Create.c
> +++ b/Create.c
> @@ -904,7 +904,12 @@ int Create(struct supertype *st, char *mddev,
>  				if (st->ss->add_to_super(st, &inf->disk,
>  							 fd, dv->devname,
>  							 dv->data_offset)) {
> -					ioctl(mdfd, STOP_ARRAY, NULL);
> +					int count = 5;
> +					while (count &&
> +					       (ioctl(mdfd, STOP_ARRAY, NULL) < 0)) {
> +						usleep(100000);
> +						count--;
> +					}
>  					goto abort_locked;
>  				}
>  				st->ss->getinfo_super(st, inf, NULL);

I don't like this.  I don't really like any of the other loops like this that
are already in the code either.  I wonder if we can avoid the need for it.

Given that the array hasn't been started yet, no other process can actually
be *using* the array.  And given that we have an O_EXCL open at this point,
no other process can be trying to stop/start the array.
So it should be safe to change the kernel to not fail in this situation.

If you apply this kernel patch:

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1294238610df..1bf3fe1ecc79 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5362,7 +5362,7 @@ static int do_md_stop(struct mddev * mddev, int mode,
 	mddev_lock_nointr(mddev);
 
 	mutex_lock(&mddev->open_mutex);
-	if (atomic_read(&mddev->openers) > !!bdev ||
+	if ((mddev->pers && atomic_read(&mddev->openers) > !!bdev) ||
 	    mddev->sysfs_active ||
 	    mddev->sync_thread ||
 	    (bdev && !test_bit(MD_STILL_CLOSED, &mddev->flags))) {


does that fir your problem?  Can you see any reason not to allow STOP_ARRAY
to succeed in this situation?

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* RE: [PATCH] Create.c: Try few more times to stop array after failed creation
  2014-09-08  6:33 ` NeilBrown
@ 2014-09-09 10:04   ` Baldysiak, Pawel
  2014-09-09 10:16     ` NeilBrown
  0 siblings, 1 reply; 4+ messages in thread
From: Baldysiak, Pawel @ 2014-09-09 10:04 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid@vger.kernel.org, Paszkiewicz, Artur

> On: Monday, September 08, 2014 8:34 AM NeilBrown wrote:
> To: Baldysiak, Pawel
> Cc: linux-raid@vger.kernel.org; Paszkiewicz, Artur
> Subject: Re: [PATCH] Create.c: Try few more times to stop array after failed
> creation
> 
> On Fri, 05 Sep 2014 16:26:13 +0200 Pawel Baldysiak
> <pawel.baldysiak@intel.com> wrote:
> 
> > Sometimes after failure in creation (exp. due to duplicate devices in
> > create command) newly created empty md array will not be stopped due
> > to openers>1 (create_mddev will not manage to drop lock).
> > In this case ioctl() will return error - this needs to be checked and
> > if occurs - sending STOP_ARRAY should be repeat after delay to make
> > sure that mddev is stopped correctly.
> >
> > Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
> > ---
> >  Create.c |    7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/Create.c b/Create.c
> > index 330c5b4..7c8e53e 100644
> > --- a/Create.c
> > +++ b/Create.c
> > @@ -904,7 +904,12 @@ int Create(struct supertype *st, char *mddev,
> >  				if (st->ss->add_to_super(st, &inf->disk,
> >  							 fd, dv->devname,
> >  							 dv->data_offset)) {
> > -					ioctl(mdfd, STOP_ARRAY, NULL);
> > +					int count = 5;
> > +					while (count &&
> > +					       (ioctl(mdfd, STOP_ARRAY, NULL)
> < 0)) {
> > +						usleep(100000);
> > +						count--;
> > +					}
> >  					goto abort_locked;
> >  				}
> >  				st->ss->getinfo_super(st, inf, NULL);
> 
> I don't like this.  I don't really like any of the other loops like this that are
> already in the code either.  I wonder if we can avoid the need for it.
> 
> Given that the array hasn't been started yet, no other process can actually be
> *using* the array.  And given that we have an O_EXCL open at this point, no
> other process can be trying to stop/start the array.
> So it should be safe to change the kernel to not fail in this situation.
> 
> If you apply this kernel patch:
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c index
> 1294238610df..1bf3fe1ecc79 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5362,7 +5362,7 @@ static int do_md_stop(struct mddev * mddev, int
> mode,
>  	mddev_lock_nointr(mddev);
> 
>  	mutex_lock(&mddev->open_mutex);
> -	if (atomic_read(&mddev->openers) > !!bdev ||
> +	if ((mddev->pers && atomic_read(&mddev->openers) > !!bdev) ||
>  	    mddev->sysfs_active ||
>  	    mddev->sync_thread ||
>  	    (bdev && !test_bit(MD_STILL_CLOSED, &mddev->flags))) {
> 
> 
> does that fir your problem?  Can you see any reason not to allow
> STOP_ARRAY to succeed in this situation?
> 
Hi Neil
Thanks for your answer.
To fix this problem same thing needs to be added in one more place in kernel:

@@ -6454,7 +6454,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 		 * and writes
 		 */
 		mutex_lock(&mddev->open_mutex);
-		if (atomic_read(&mddev->openers) > 1) {
+		if (mddev->pers && (atomic_read(&mddev->openers) > 1)) {
 			mutex_unlock(&mddev->open_mutex);
 			err = -EBUSY;
 			goto abort;

Should I prepare the patch, or you can do it?

Thanks,
Pawel Baldysiak

> Thanks,
> NeilBrown

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] Create.c: Try few more times to stop array after failed creation
  2014-09-09 10:04   ` Baldysiak, Pawel
@ 2014-09-09 10:16     ` NeilBrown
  0 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2014-09-09 10:16 UTC (permalink / raw)
  To: Baldysiak, Pawel; +Cc: linux-raid@vger.kernel.org, Paszkiewicz, Artur

[-- Attachment #1: Type: text/plain, Size: 3514 bytes --]

On Tue, 9 Sep 2014 10:04:38 +0000 "Baldysiak, Pawel"
<pawel.baldysiak@intel.com> wrote:

> > On: Monday, September 08, 2014 8:34 AM NeilBrown wrote:
> > To: Baldysiak, Pawel
> > Cc: linux-raid@vger.kernel.org; Paszkiewicz, Artur
> > Subject: Re: [PATCH] Create.c: Try few more times to stop array after failed
> > creation
> > 
> > On Fri, 05 Sep 2014 16:26:13 +0200 Pawel Baldysiak
> > <pawel.baldysiak@intel.com> wrote:
> > 
> > > Sometimes after failure in creation (exp. due to duplicate devices in
> > > create command) newly created empty md array will not be stopped due
> > > to openers>1 (create_mddev will not manage to drop lock).
> > > In this case ioctl() will return error - this needs to be checked and
> > > if occurs - sending STOP_ARRAY should be repeat after delay to make
> > > sure that mddev is stopped correctly.
> > >
> > > Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
> > > ---
> > >  Create.c |    7 ++++++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Create.c b/Create.c
> > > index 330c5b4..7c8e53e 100644
> > > --- a/Create.c
> > > +++ b/Create.c
> > > @@ -904,7 +904,12 @@ int Create(struct supertype *st, char *mddev,
> > >  				if (st->ss->add_to_super(st, &inf->disk,
> > >  							 fd, dv->devname,
> > >  							 dv->data_offset)) {
> > > -					ioctl(mdfd, STOP_ARRAY, NULL);
> > > +					int count = 5;
> > > +					while (count &&
> > > +					       (ioctl(mdfd, STOP_ARRAY, NULL)
> > < 0)) {
> > > +						usleep(100000);
> > > +						count--;
> > > +					}
> > >  					goto abort_locked;
> > >  				}
> > >  				st->ss->getinfo_super(st, inf, NULL);
> > 
> > I don't like this.  I don't really like any of the other loops like this that are
> > already in the code either.  I wonder if we can avoid the need for it.
> > 
> > Given that the array hasn't been started yet, no other process can actually be
> > *using* the array.  And given that we have an O_EXCL open at this point, no
> > other process can be trying to stop/start the array.
> > So it should be safe to change the kernel to not fail in this situation.
> > 
> > If you apply this kernel patch:
> > 
> > diff --git a/drivers/md/md.c b/drivers/md/md.c index
> > 1294238610df..1bf3fe1ecc79 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -5362,7 +5362,7 @@ static int do_md_stop(struct mddev * mddev, int
> > mode,
> >  	mddev_lock_nointr(mddev);
> > 
> >  	mutex_lock(&mddev->open_mutex);
> > -	if (atomic_read(&mddev->openers) > !!bdev ||
> > +	if ((mddev->pers && atomic_read(&mddev->openers) > !!bdev) ||
> >  	    mddev->sysfs_active ||
> >  	    mddev->sync_thread ||
> >  	    (bdev && !test_bit(MD_STILL_CLOSED, &mddev->flags))) {
> > 
> > 
> > does that fir your problem?  Can you see any reason not to allow
> > STOP_ARRAY to succeed in this situation?
> > 
> Hi Neil
> Thanks for your answer.
> To fix this problem same thing needs to be added in one more place in kernel:
> 
> @@ -6454,7 +6454,7 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
>  		 * and writes
>  		 */
>  		mutex_lock(&mddev->open_mutex);
> -		if (atomic_read(&mddev->openers) > 1) {
> +		if (mddev->pers && (atomic_read(&mddev->openers) > 1)) {
>  			mutex_unlock(&mddev->open_mutex);
>  			err = -EBUSY;
>  			goto abort;
> 
> Should I prepare the patch, or you can do it?

I'll do it thanks - I have it half done already.

Thanks for testing.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-09-09 10:16 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-09-05 14:26 [PATCH] Create.c: Try few more times to stop array after failed creation Pawel Baldysiak
2014-09-08  6:33 ` NeilBrown
2014-09-09 10:04   ` Baldysiak, Pawel
2014-09-09 10:16     ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).