[PATCH 0/3] A few more md patches for 2.6.26

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/3] A few more md patches for 2.6.26
@ 2008-04-30  1:19 Dan Williams
  2008-04-30  1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Dan Williams @ 2008-04-30  1:19 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid

Hi Neil,

The first two patches add some necessary sysfs_notify calls for external
metadata arrays. They replace
md-ping-userspace-on-ext-metadata-events.patch which, as you noted, did
these notifications globally in md_check_recovery() rather than at the
local site where the state changes.

While testing these I ran into:

   WARNING: at include/linux/blkdev.h:443 queue_flag_clear+0x2c/0x4d()

...due to the recent "block: make queue flags non-atomic" (commit:
75ad23bc0fcb4f992a5d06982bf0857ab1738e9e).  I believe patch 3/3 is the
intended fix. 

---

Dan Williams (3):
      md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
      md: ping userspace on 'stop' events
      md: ping userspace on 'write-pending' events

 drivers/md/md.c           |   25 +++++++++++++++++++++----
 drivers/md/raid1.c        |   10 ++++++++--
 drivers/md/raid10.c       |   10 ++++++++--
 drivers/md/raid5.c        |   12 ++++++++++--
 include/linux/raid/md_k.h |    3 +++
 5 files changed, 50 insertions(+), 10 deletions(-)

-- 
Regards,
Dan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] md: ping userspace on 'write-pending' events
  2008-04-30  1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams
@ 2008-04-30  1:19 ` Dan Williams
  2008-05-01  4:44   ` Neil Brown
  2008-04-30  1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams
  2008-04-30  1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams
  2 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-04-30  1:19 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid

Also, when userspace writes 'active', clear all mddev->flags to satisfy
md_write_start (the other bits do not matter to external-metadata arrays).

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---

 drivers/md/md.c |   19 +++++++++++++++----
 1 files changed, 15 insertions(+), 4 deletions(-)


diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6859bcd..ad53035 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2664,7 +2664,7 @@ array_state_store(mddev_t *mddev, const char *buf, size_t len)
 		if (mddev->pers) {
 			restart_array(mddev);
 			if (mddev->external)
-				clear_bit(MD_CHANGE_CLEAN, &mddev->flags);
+				mddev->flags = 0;
 			wake_up(&mddev->sb_wait);
 			err = 0;
 		} else {
@@ -5413,9 +5413,14 @@ void md_write_start(mddev_t *mddev, struct bio *bi)
 		if (mddev->in_sync) {
 			mddev->in_sync = 0;
 			set_bit(MD_CHANGE_CLEAN, &mddev->flags);
+			spin_unlock_irq(&mddev->write_lock);
+
+			/* notify userspace to handle clean->dirty */
+			if (mddev->external)
+				sysfs_notify(&mddev->kobj, NULL, "array_state");
 			md_wakeup_thread(mddev->thread);
-		}
-		spin_unlock_irq(&mddev->write_lock);
+		} else
+			spin_unlock_irq(&mddev->write_lock);
 	}
 	wait_event(mddev->sb_wait, mddev->flags==0);
 }
@@ -5451,7 +5456,13 @@ void md_allow_write(mddev_t *mddev)
 		    mddev->safemode == 0)
 			mddev->safemode = 1;
 		spin_unlock_irq(&mddev->write_lock);
-		md_update_sb(mddev, 0);
+
+		/* wait for the dirty state to be recorded in the metadata */
+		if (mddev->external) {
+			sysfs_notify(&mddev->kobj, NULL, "array_state");
+			wait_event(mddev->sb_wait, mddev->flags == 0);
+		} else
+			md_update_sb(mddev, 0);
 	} else
 		spin_unlock_irq(&mddev->write_lock);
 }


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events
  2008-04-30  1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
@ 2008-05-01  4:44   ` Neil Brown
  2008-05-01 20:57     ` Dan Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Neil Brown @ 2008-05-01  4:44 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-raid

On Tuesday April 29, dan.j.williams@intel.com wrote:
> Also, when userspace writes 'active', clear all mddev->flags to satisfy
> md_write_start (the other bits do not matter to external-metadata arrays).
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Thanks Dan.

There are two things that I don't like about this patch.

1/ I don't think clearing all the flags in array_state_store is really
   right.   You do that to make sure that MD_CHANGE_DEVS is clear, but
   there is no guarantee that it won't be set again before it is
   tested in md_write_start.
   I think it is best to just get md_write_start (and md_allow_write)
   to just test the bits they are really interested in.

2/ having the tests of mddev->external isn't necessary.  I really want
   "array_state" to get a notification whenever it changes, no matter
   what sort of metadata is being used.

Change the patch based on those observations I get:

---------------------------------
Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2008-04-29 12:27:58.000000000 +1000
+++ ./drivers/md/md.c	2008-05-01 14:35:16.000000000 +1000
@@ -5434,8 +5434,11 @@ void md_write_start(mddev_t *mddev, stru
 			md_wakeup_thread(mddev->thread);
 		}
 		spin_unlock_irq(&mddev->write_lock);
+		sysfs_notify(&mddev->kobj, NULL, "array_state");
 	}
-	wait_event(mddev->sb_wait, mddev->flags==0);
+	wait_event(mddev->sb_wait,
+		   !test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
+		   !test_bit(MD_CHANGE_PENDING, &mddev->flags));
 }
 
 void md_write_end(mddev_t *mddev)
@@ -5470,6 +5473,12 @@ void md_allow_write(mddev_t *mddev)
 			mddev->safemode = 1;
 		spin_unlock_irq(&mddev->write_lock);
 		md_update_sb(mddev, 0);
+
+		sysfs_notify(&mddev->kobj, NULL, "array_state");
+		/* wait for the dirty state to be recorded in the metadata */
+		wait_event(mddev->sb_wait,
+			   !test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
+			   !test_bit(MD_CHANGE_PENDING, &mddev->flags));
 	} else
 		spin_unlock_irq(&mddev->write_lock);
 }
----------------------------

Look OK ??

Thanks,
NeilBrown


> ---
> 
>  drivers/md/md.c |   19 +++++++++++++++----
>  1 files changed, 15 insertions(+), 4 deletions(-)
> 
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 6859bcd..ad53035 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -2664,7 +2664,7 @@ array_state_store(mddev_t *mddev, const char *buf, size_t len)
>  		if (mddev->pers) {
>  			restart_array(mddev);
>  			if (mddev->external)
> -				clear_bit(MD_CHANGE_CLEAN, &mddev->flags);
> +				mddev->flags = 0;
>  			wake_up(&mddev->sb_wait);
>  			err = 0;
>  		} else {
> @@ -5413,9 +5413,14 @@ void md_write_start(mddev_t *mddev, struct bio *bi)
>  		if (mddev->in_sync) {
>  			mddev->in_sync = 0;
>  			set_bit(MD_CHANGE_CLEAN, &mddev->flags);
> +			spin_unlock_irq(&mddev->write_lock);
> +
> +			/* notify userspace to handle clean->dirty */
> +			if (mddev->external)
> +				sysfs_notify(&mddev->kobj, NULL, "array_state");
>  			md_wakeup_thread(mddev->thread);
> -		}
> -		spin_unlock_irq(&mddev->write_lock);
> +		} else
> +			spin_unlock_irq(&mddev->write_lock);
>  	}
>  	wait_event(mddev->sb_wait, mddev->flags==0);
>  }
> @@ -5451,7 +5456,13 @@ void md_allow_write(mddev_t *mddev)
>  		    mddev->safemode == 0)
>  			mddev->safemode = 1;
>  		spin_unlock_irq(&mddev->write_lock);
> -		md_update_sb(mddev, 0);
> +
> +		/* wait for the dirty state to be recorded in the metadata */
> +		if (mddev->external) {
> +			sysfs_notify(&mddev->kobj, NULL, "array_state");
> +			wait_event(mddev->sb_wait, mddev->flags == 0);
> +		} else
> +			md_update_sb(mddev, 0);
>  	} else
>  		spin_unlock_irq(&mddev->write_lock);
>  }

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events
  2008-05-01  4:44   ` Neil Brown
@ 2008-05-01 20:57     ` Dan Williams
  2008-05-02  2:10       ` Neil Brown
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-05-01 20:57 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

On Wed, Apr 30, 2008 at 9:44 PM, Neil Brown <neilb@suse.de> wrote:
> On Tuesday April 29, dan.j.williams@intel.com wrote:
>  > Also, when userspace writes 'active', clear all mddev->flags to satisfy
>  > md_write_start (the other bits do not matter to external-metadata arrays).
>  >
>  > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
>  Thanks Dan.
>
>  There are two things that I don't like about this patch.
>
>  1/ I don't think clearing all the flags in array_state_store is really
>    right.   You do that to make sure that MD_CHANGE_DEVS is clear, but
>    there is no guarantee that it won't be set again before it is
>    tested in md_write_start.
>    I think it is best to just get md_write_start (and md_allow_write)
>    to just test the bits they are really interested in.
>
>  2/ having the tests of mddev->external isn't necessary.  I really want
>    "array_state" to get a notification whenever it changes, no matter
>    what sort of metadata is being used.
>
>  Change the patch based on those observations I get:
>
>  ---------------------------------
>  Signed-off-by: Neil Brown <neilb@suse.de>
>
>  ### Diffstat output
>   ./drivers/md/md.c |   11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
>
>  diff .prev/drivers/md/md.c ./drivers/md/md.c
>  --- .prev/drivers/md/md.c       2008-04-29 12:27:58.000000000 +1000
>  +++ ./drivers/md/md.c   2008-05-01 14:35:16.000000000 +1000
>  @@ -5434,8 +5434,11 @@ void md_write_start(mddev_t *mddev, stru
>
>                         md_wakeup_thread(mddev->thread);
>                 }
>                 spin_unlock_irq(&mddev->write_lock);
>
> +               sysfs_notify(&mddev->kobj, NULL, "array_state");
>         }
>  -       wait_event(mddev->sb_wait, mddev->flags==0);
>  +       wait_event(mddev->sb_wait,
>  +                  !test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
>  +                  !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>   }
>
>   void md_write_end(mddev_t *mddev)
>  @@ -5470,6 +5473,12 @@ void md_allow_write(mddev_t *mddev)
>
>                         mddev->safemode = 1;
>                 spin_unlock_irq(&mddev->write_lock);
>                 md_update_sb(mddev, 0);
>  +
>
> +               sysfs_notify(&mddev->kobj, NULL, "array_state");
>
> +               /* wait for the dirty state to be recorded in the metadata */
>  +               wait_event(mddev->sb_wait,
>  +                          !test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
>  +                          !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>
>         } else
>                 spin_unlock_irq(&mddev->write_lock);
>   }
>  ----------------------------
>
>  Look OK ??
>

[ thinking out loud here to make sure we are on the same page ]

I had to convince myself that it is still holding off writes while
MD_CHANGE_DEVS is set.  And yes, MD_CHANGE_PENDING is not cleared
until after MD_CHANGE_DEVS is handled.  So, the mddev->flags==0 test
was indeed more restrictive than it needed to be.  Adding the
unqualified wait_event() to md_allow_write() also makes sense since it
is needed in the 'external' case and should be a nop in the
'!external' case since those bits are cleared in md_update_sb().

Simple test confirms this is working as expected.

--
Dan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events
  2008-05-01 20:57     ` Dan Williams
@ 2008-05-02  2:10       ` Neil Brown
  0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2008-05-02  2:10 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-raid

On Thursday May 1, dan.j.williams@intel.com wrote:
> 
> [ thinking out loud here to make sure we are on the same page ]
> 
> I had to convince myself that it is still holding off writes while
> MD_CHANGE_DEVS is set.  And yes, MD_CHANGE_PENDING is not cleared
> until after MD_CHANGE_DEVS is handled.  So, the mddev->flags==0 test
> was indeed more restrictive than it needed to be.  Adding the
> unqualified wait_event() to md_allow_write() also makes sense since it
> is needed in the 'external' case and should be a nop in the
> '!external' case since those bits are cleared in md_update_sb().
> 
> Simple test confirms this is working as expected.

Good, thanks.

NeilBrown

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/3] md: ping userspace on 'stop' events
  2008-04-30  1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams
  2008-04-30  1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
@ 2008-04-30  1:19 ` Dan Williams
  2008-05-01  4:50   ` Neil Brown
  2008-04-30  1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams
  2 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-04-30  1:19 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid

This additional notification to 'array_state' is needed to allow the monitor
application to learn about stop events via sysfs.  The
sysfs_notify("sync_action") call that comes at the end of do_md_stop() (via
md_new_event) is insufficient since the 'sync_action' attribute has been
removed by this point.

(Seems like a sysfs-notify-on-removal patch is a better fix.  Currently removal
 updates the event count but does not wake up waiters)

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---

 drivers/md/md.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)


diff --git a/drivers/md/md.c b/drivers/md/md.c
index ad53035..606c8ee 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3693,6 +3693,9 @@ static int do_md_stop(mddev_t * mddev, int mode)
 
 			module_put(mddev->pers->owner);
 			mddev->pers = NULL;
+			/* tell userspace to handle 'inactive' */
+			if (mddev->external)
+				sysfs_notify(&mddev->kobj, NULL, "array_state");
 
 			set_capacity(disk, 0);
 			mddev->changed = 1;


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] md: ping userspace on 'stop' events
  2008-04-30  1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams
@ 2008-05-01  4:50   ` Neil Brown
  0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2008-05-01  4:50 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-raid

On Tuesday April 29, dan.j.williams@intel.com wrote:
> This additional notification to 'array_state' is needed to allow the monitor
> application to learn about stop events via sysfs.  The
> sysfs_notify("sync_action") call that comes at the end of do_md_stop() (via
> md_new_event) is insufficient since the 'sync_action' attribute has been
> removed by this point.
> 
> (Seems like a sysfs-notify-on-removal patch is a better fix.  Currently removal
>  updates the event count but does not wake up waiters)
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---

Thanks.

Again, drop the "if (mddev->external)".  It isn't needed (I have made
this change).

NeilBrown

> 
>  drivers/md/md.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ad53035..606c8ee 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -3693,6 +3693,9 @@ static int do_md_stop(mddev_t * mddev, int mode)
>  
>  			module_put(mddev->pers->owner);
>  			mddev->pers = NULL;
> +			/* tell userspace to handle 'inactive' */
> +			if (mddev->external)
> +				sysfs_notify(&mddev->kobj, NULL, "array_state");
>  
>  			set_capacity(disk, 0);
>  			mddev->changed = 1;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
  2008-04-30  1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams
  2008-04-30  1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
  2008-04-30  1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams
@ 2008-04-30  1:19 ` Dan Williams
  2008-05-01  0:36   ` Neil Brown
  2 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-04-30  1:19 UTC (permalink / raw)
  To: neilb; +Cc: linux-raid

Now that queue flags are no longer atomic (commit:
75ad23bc0fcb4f992a5d06982bf0857ab1738e9e)  we must protect calls to
blk_remove_plug with spin_lock(q->queue_lock).

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---

 drivers/md/md.c           |    3 +++
 drivers/md/raid1.c        |   10 ++++++++--
 drivers/md/raid10.c       |   10 ++++++++--
 drivers/md/raid5.c        |   12 ++++++++++--
 include/linux/raid/md_k.h |    3 +++
 5 files changed, 32 insertions(+), 6 deletions(-)


diff --git a/drivers/md/md.c b/drivers/md/md.c
index 606c8ee..e389978 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -283,6 +283,9 @@ static mddev_t * mddev_find(dev_t unit)
 		kfree(new);
 		return NULL;
 	}
+	spin_lock_init(&new->queue_lock);
+	new->queue->queue_lock = &new->queue_lock;
+
 	/* Can be unlocked because the queue is new: no concurrency */
 	queue_flag_set_unlocked(QUEUE_FLAG_CLUSTER, new->queue);
 
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6778b7c..79f19b1 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -603,9 +603,13 @@ static int flush_pending_writes(conf_t *conf)
 	spin_lock_irq(&conf->device_lock);
 
 	if (conf->pending_bio_list.head) {
+		struct request_queue *q = conf->mddev->queue;
 		struct bio *bio;
+
 		bio = bio_list_get(&conf->pending_bio_list);
-		blk_remove_plug(conf->mddev->queue);
+		spin_lock(q->queue_lock);
+		blk_remove_plug(q);
+		spin_unlock(q->queue_lock);
 		spin_unlock_irq(&conf->device_lock);
 		/* flush any pending bitmap writes to
 		 * disk before proceeding w/ I/O */
@@ -964,7 +968,9 @@ static int make_request(struct request_queue *q, struct bio * bio)
 	bio_list_merge(&conf->pending_bio_list, &bl);
 	bio_list_init(&bl);
 
-	blk_plug_device(mddev->queue);
+	spin_lock(q->queue_lock);
+	blk_plug_device(q);
+	spin_unlock(q->queue_lock);
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 
 	/* In case raid1d snuck into freeze_array */
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 5938fa9..3ecd437 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -646,9 +646,13 @@ static int flush_pending_writes(conf_t *conf)
 	spin_lock_irq(&conf->device_lock);
 
 	if (conf->pending_bio_list.head) {
+		struct request_queue *q = conf->mddev->queue;
 		struct bio *bio;
+
 		bio = bio_list_get(&conf->pending_bio_list);
-		blk_remove_plug(conf->mddev->queue);
+		spin_lock(q->queue_lock);
+		blk_remove_plug(q);
+		spin_unlock(q->queue_lock);
 		spin_unlock_irq(&conf->device_lock);
 		/* flush any pending bitmap writes to disk
 		 * before proceeding w/ I/O */
@@ -955,7 +959,9 @@ static int make_request(struct request_queue *q, struct bio * bio)
 	bitmap_startwrite(mddev->bitmap, bio->bi_sector, r10_bio->sectors, 0);
 	spin_lock_irqsave(&conf->device_lock, flags);
 	bio_list_merge(&conf->pending_bio_list, &bl);
-	blk_plug_device(mddev->queue);
+	spin_lock(q->queue_lock);
+	blk_plug_device(q);
+	spin_unlock(q->queue_lock);
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 
 	/* In case raid10d snuck in to freeze_array */
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e057944..b64a545 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -124,17 +124,23 @@ static void print_raid5_conf (raid5_conf_t *conf);
 
 static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh)
 {
+	struct request_queue *q = conf->mddev->queue;
+
 	if (atomic_dec_and_test(&sh->count)) {
 		BUG_ON(!list_empty(&sh->lru));
 		BUG_ON(atomic_read(&conf->active_stripes)==0);
 		if (test_bit(STRIPE_HANDLE, &sh->state)) {
 			if (test_bit(STRIPE_DELAYED, &sh->state)) {
 				list_add_tail(&sh->lru, &conf->delayed_list);
-				blk_plug_device(conf->mddev->queue);
+				spin_lock(q->queue_lock);
+				blk_plug_device(q);
+				spin_unlock(q->queue_lock);
 			} else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
 				   sh->bm_seq - conf->seq_write > 0) {
 				list_add_tail(&sh->lru, &conf->bitmap_list);
-				blk_plug_device(conf->mddev->queue);
+				spin_lock(q->queue_lock);
+				blk_plug_device(q);
+				spin_unlock(q->queue_lock);
 			} else {
 				clear_bit(STRIPE_BIT_DELAY, &sh->state);
 				list_add_tail(&sh->lru, &conf->handle_list);
@@ -3268,10 +3274,12 @@ static void raid5_unplug_device(struct request_queue *q)
 
 	spin_lock_irqsave(&conf->device_lock, flags);
 
+	spin_lock(q->queue_lock);
 	if (blk_remove_plug(q)) {
 		conf->seq_flush++;
 		raid5_activate_delayed(conf);
 	}
+	spin_unlock(q->queue_lock);
 	md_wakeup_thread(mddev->thread);
 
 	spin_unlock_irqrestore(&conf->device_lock, flags);
diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h
index 812ffa5..000a215 100644
--- a/include/linux/raid/md_k.h
+++ b/include/linux/raid/md_k.h
@@ -240,6 +240,9 @@ struct mddev_s
 	struct timer_list		safemode_timer;
 	atomic_t			writes_pending; 
 	struct request_queue		*queue;	/* for plugging ... */
+	spinlock_t			queue_lock; /* we supply the lock
+						     * for blk-core
+						     */
 
 	atomic_t                        write_behind; /* outstanding async IO */
 	unsigned int                    max_write_behind; /* 0 = sync */


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
  2008-04-30  1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams
@ 2008-05-01  0:36   ` Neil Brown
  2008-05-01  0:50     ` Dan Williams
  0 siblings, 1 reply; 11+ messages in thread
From: Neil Brown @ 2008-05-01  0:36 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-raid

On Tuesday April 29, dan.j.williams@intel.com wrote:
> Now that queue flags are no longer atomic (commit:
> 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e)  we must protect calls to
> blk_remove_plug with spin_lock(q->queue_lock).

Can't we just do

  q->queue_lock = &conf->device_lock

and appropriate places in the various ->run functions?

It seems to be that we are doing appropriate locking, we just need to
convince queue_flag_set / queue_flag_clear that the correct lock is
locked.
??

NeilBrown


> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> 
>  drivers/md/md.c           |    3 +++
>  drivers/md/raid1.c        |   10 ++++++++--
>  drivers/md/raid10.c       |   10 ++++++++--
>  drivers/md/raid5.c        |   12 ++++++++++--
>  include/linux/raid/md_k.h |    3 +++
>  5 files changed, 32 insertions(+), 6 deletions(-)
> 
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 606c8ee..e389978 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -283,6 +283,9 @@ static mddev_t * mddev_find(dev_t unit)
>  		kfree(new);
>  		return NULL;
>  	}
> +	spin_lock_init(&new->queue_lock);
> +	new->queue->queue_lock = &new->queue_lock;
> +
>  	/* Can be unlocked because the queue is new: no concurrency */
>  	queue_flag_set_unlocked(QUEUE_FLAG_CLUSTER, new->queue);
>  
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 6778b7c..79f19b1 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -603,9 +603,13 @@ static int flush_pending_writes(conf_t *conf)
>  	spin_lock_irq(&conf->device_lock);
>  
>  	if (conf->pending_bio_list.head) {
> +		struct request_queue *q = conf->mddev->queue;
>  		struct bio *bio;
> +
>  		bio = bio_list_get(&conf->pending_bio_list);
> -		blk_remove_plug(conf->mddev->queue);
> +		spin_lock(q->queue_lock);
> +		blk_remove_plug(q);
> +		spin_unlock(q->queue_lock);
>  		spin_unlock_irq(&conf->device_lock);
>  		/* flush any pending bitmap writes to
>  		 * disk before proceeding w/ I/O */
> @@ -964,7 +968,9 @@ static int make_request(struct request_queue *q, struct bio * bio)
>  	bio_list_merge(&conf->pending_bio_list, &bl);
>  	bio_list_init(&bl);
>  
> -	blk_plug_device(mddev->queue);
> +	spin_lock(q->queue_lock);
> +	blk_plug_device(q);
> +	spin_unlock(q->queue_lock);
>  	spin_unlock_irqrestore(&conf->device_lock, flags);
>  
>  	/* In case raid1d snuck into freeze_array */
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 5938fa9..3ecd437 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -646,9 +646,13 @@ static int flush_pending_writes(conf_t *conf)
>  	spin_lock_irq(&conf->device_lock);
>  
>  	if (conf->pending_bio_list.head) {
> +		struct request_queue *q = conf->mddev->queue;
>  		struct bio *bio;
> +
>  		bio = bio_list_get(&conf->pending_bio_list);
> -		blk_remove_plug(conf->mddev->queue);
> +		spin_lock(q->queue_lock);
> +		blk_remove_plug(q);
> +		spin_unlock(q->queue_lock);
>  		spin_unlock_irq(&conf->device_lock);
>  		/* flush any pending bitmap writes to disk
>  		 * before proceeding w/ I/O */
> @@ -955,7 +959,9 @@ static int make_request(struct request_queue *q, struct bio * bio)
>  	bitmap_startwrite(mddev->bitmap, bio->bi_sector, r10_bio->sectors, 0);
>  	spin_lock_irqsave(&conf->device_lock, flags);
>  	bio_list_merge(&conf->pending_bio_list, &bl);
> -	blk_plug_device(mddev->queue);
> +	spin_lock(q->queue_lock);
> +	blk_plug_device(q);
> +	spin_unlock(q->queue_lock);
>  	spin_unlock_irqrestore(&conf->device_lock, flags);
>  
>  	/* In case raid10d snuck in to freeze_array */
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index e057944..b64a545 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -124,17 +124,23 @@ static void print_raid5_conf (raid5_conf_t *conf);
>  
>  static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh)
>  {
> +	struct request_queue *q = conf->mddev->queue;
> +
>  	if (atomic_dec_and_test(&sh->count)) {
>  		BUG_ON(!list_empty(&sh->lru));
>  		BUG_ON(atomic_read(&conf->active_stripes)==0);
>  		if (test_bit(STRIPE_HANDLE, &sh->state)) {
>  			if (test_bit(STRIPE_DELAYED, &sh->state)) {
>  				list_add_tail(&sh->lru, &conf->delayed_list);
> -				blk_plug_device(conf->mddev->queue);
> +				spin_lock(q->queue_lock);
> +				blk_plug_device(q);
> +				spin_unlock(q->queue_lock);
>  			} else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
>  				   sh->bm_seq - conf->seq_write > 0) {
>  				list_add_tail(&sh->lru, &conf->bitmap_list);
> -				blk_plug_device(conf->mddev->queue);
> +				spin_lock(q->queue_lock);
> +				blk_plug_device(q);
> +				spin_unlock(q->queue_lock);
>  			} else {
>  				clear_bit(STRIPE_BIT_DELAY, &sh->state);
>  				list_add_tail(&sh->lru, &conf->handle_list);
> @@ -3268,10 +3274,12 @@ static void raid5_unplug_device(struct request_queue *q)
>  
>  	spin_lock_irqsave(&conf->device_lock, flags);
>  
> +	spin_lock(q->queue_lock);
>  	if (blk_remove_plug(q)) {
>  		conf->seq_flush++;
>  		raid5_activate_delayed(conf);
>  	}
> +	spin_unlock(q->queue_lock);
>  	md_wakeup_thread(mddev->thread);
>  
>  	spin_unlock_irqrestore(&conf->device_lock, flags);
> diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h
> index 812ffa5..000a215 100644
> --- a/include/linux/raid/md_k.h
> +++ b/include/linux/raid/md_k.h
> @@ -240,6 +240,9 @@ struct mddev_s
>  	struct timer_list		safemode_timer;
>  	atomic_t			writes_pending; 
>  	struct request_queue		*queue;	/* for plugging ... */
> +	spinlock_t			queue_lock; /* we supply the lock
> +						     * for blk-core
> +						     */
>  
>  	atomic_t                        write_behind; /* outstanding async IO */
>  	unsigned int                    max_write_behind; /* 0 = sync */

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
  2008-05-01  0:36   ` Neil Brown
@ 2008-05-01  0:50     ` Dan Williams
  2008-05-01  2:33       ` Neil Brown
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-05-01  0:50 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

On Wed, Apr 30, 2008 at 5:36 PM, Neil Brown <neilb@suse.de> wrote:
> On Tuesday April 29, dan.j.williams@intel.com wrote:
>  > Now that queue flags are no longer atomic (commit:
>  > 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e)  we must protect calls to
>  > blk_remove_plug with spin_lock(q->queue_lock).
>
>  Can't we just do
>
>   q->queue_lock = &conf->device_lock
>
>  and appropriate places in the various ->run functions?
>
>  It seems to be that we are doing appropriate locking, we just need to
>  convince queue_flag_set / queue_flag_clear that the correct lock is
>  locked.
>  ??
>

That's much simpler.  Hmm... as an additional cleanup should
device_lock then move to mddev_t since it really is a mddev-level
lock?  There seem to be more than a few places that cast
mddev->private just to get conf->device_lock.  Then all this lock
initialization is centralized.

--
Dan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
  2008-05-01  0:50     ` Dan Williams
@ 2008-05-01  2:33       ` Neil Brown
  0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2008-05-01  2:33 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-raid

On Wednesday April 30, dan.j.williams@intel.com wrote:
> On Wed, Apr 30, 2008 at 5:36 PM, Neil Brown <neilb@suse.de> wrote:
> > On Tuesday April 29, dan.j.williams@intel.com wrote:
> >  > Now that queue flags are no longer atomic (commit:
> >  > 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e)  we must protect calls to
> >  > blk_remove_plug with spin_lock(q->queue_lock).
> >
> >  Can't we just do
> >
> >   q->queue_lock = &conf->device_lock
> >
> >  and appropriate places in the various ->run functions?
> >
> >  It seems to be that we are doing appropriate locking, we just need to
> >  convince queue_flag_set / queue_flag_clear that the correct lock is
> >  locked.
> >  ??
> >
> 
> That's much simpler.  Hmm... as an additional cleanup should
> device_lock then move to mddev_t since it really is a mddev-level
> lock?  There seem to be more than a few places that cast
> mddev->private just to get conf->device_lock.  Then all this lock
> initialization is centralized.

Conversely, there seem to be plenty of places where we access
conf->device_lock and don't have an 'mddev' handy.
They could of course become
      conf->mddev->device_lock
but I'm not sure that is an improvement.

I think you would need to actually do the change, then decide if the
benefits outweigh the costs.

It's probably best to find a minimal solution to the current problem
and just add the assignments to q->queue_lock for now.  We can then
explore whether there are cleanup opportunities.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-05-02  2:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-30  1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams
2008-04-30  1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
2008-05-01  4:44   ` Neil Brown
2008-05-01 20:57     ` Dan Williams
2008-05-02  2:10       ` Neil Brown
2008-04-30  1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams
2008-05-01  4:50   ` Neil Brown
2008-04-30  1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams
2008-05-01  0:36   ` Neil Brown
2008-05-01  0:50     ` Dan Williams
2008-05-01  2:33       ` Neil Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).