* [PATCH 0/3] A few more md patches for 2.6.26
@ 2008-04-30 1:19 Dan Williams
2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Dan Williams @ 2008-04-30 1:19 UTC (permalink / raw)
To: neilb; +Cc: linux-raid
Hi Neil,
The first two patches add some necessary sysfs_notify calls for external
metadata arrays. They replace
md-ping-userspace-on-ext-metadata-events.patch which, as you noted, did
these notifications globally in md_check_recovery() rather than at the
local site where the state changes.
While testing these I ran into:
WARNING: at include/linux/blkdev.h:443 queue_flag_clear+0x2c/0x4d()
...due to the recent "block: make queue flags non-atomic" (commit:
75ad23bc0fcb4f992a5d06982bf0857ab1738e9e). I believe patch 3/3 is the
intended fix.
---
Dan Williams (3):
md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
md: ping userspace on 'stop' events
md: ping userspace on 'write-pending' events
drivers/md/md.c | 25 +++++++++++++++++++++----
drivers/md/raid1.c | 10 ++++++++--
drivers/md/raid10.c | 10 ++++++++--
drivers/md/raid5.c | 12 ++++++++++--
include/linux/raid/md_k.h | 3 +++
5 files changed, 50 insertions(+), 10 deletions(-)
--
Regards,
Dan
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/3] md: ping userspace on 'write-pending' events
2008-04-30 1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams
@ 2008-04-30 1:19 ` Dan Williams
2008-05-01 4:44 ` Neil Brown
2008-04-30 1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams
2008-04-30 1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams
2 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-04-30 1:19 UTC (permalink / raw)
To: neilb; +Cc: linux-raid
Also, when userspace writes 'active', clear all mddev->flags to satisfy
md_write_start (the other bits do not matter to external-metadata arrays).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/md/md.c | 19 +++++++++++++++----
1 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6859bcd..ad53035 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2664,7 +2664,7 @@ array_state_store(mddev_t *mddev, const char *buf, size_t len)
if (mddev->pers) {
restart_array(mddev);
if (mddev->external)
- clear_bit(MD_CHANGE_CLEAN, &mddev->flags);
+ mddev->flags = 0;
wake_up(&mddev->sb_wait);
err = 0;
} else {
@@ -5413,9 +5413,14 @@ void md_write_start(mddev_t *mddev, struct bio *bi)
if (mddev->in_sync) {
mddev->in_sync = 0;
set_bit(MD_CHANGE_CLEAN, &mddev->flags);
+ spin_unlock_irq(&mddev->write_lock);
+
+ /* notify userspace to handle clean->dirty */
+ if (mddev->external)
+ sysfs_notify(&mddev->kobj, NULL, "array_state");
md_wakeup_thread(mddev->thread);
- }
- spin_unlock_irq(&mddev->write_lock);
+ } else
+ spin_unlock_irq(&mddev->write_lock);
}
wait_event(mddev->sb_wait, mddev->flags==0);
}
@@ -5451,7 +5456,13 @@ void md_allow_write(mddev_t *mddev)
mddev->safemode == 0)
mddev->safemode = 1;
spin_unlock_irq(&mddev->write_lock);
- md_update_sb(mddev, 0);
+
+ /* wait for the dirty state to be recorded in the metadata */
+ if (mddev->external) {
+ sysfs_notify(&mddev->kobj, NULL, "array_state");
+ wait_event(mddev->sb_wait, mddev->flags == 0);
+ } else
+ md_update_sb(mddev, 0);
} else
spin_unlock_irq(&mddev->write_lock);
}
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/3] md: ping userspace on 'stop' events
2008-04-30 1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams
2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
@ 2008-04-30 1:19 ` Dan Williams
2008-05-01 4:50 ` Neil Brown
2008-04-30 1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams
2 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-04-30 1:19 UTC (permalink / raw)
To: neilb; +Cc: linux-raid
This additional notification to 'array_state' is needed to allow the monitor
application to learn about stop events via sysfs. The
sysfs_notify("sync_action") call that comes at the end of do_md_stop() (via
md_new_event) is insufficient since the 'sync_action' attribute has been
removed by this point.
(Seems like a sysfs-notify-on-removal patch is a better fix. Currently removal
updates the event count but does not wake up waiters)
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/md/md.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ad53035..606c8ee 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3693,6 +3693,9 @@ static int do_md_stop(mddev_t * mddev, int mode)
module_put(mddev->pers->owner);
mddev->pers = NULL;
+ /* tell userspace to handle 'inactive' */
+ if (mddev->external)
+ sysfs_notify(&mddev->kobj, NULL, "array_state");
set_capacity(disk, 0);
mddev->changed = 1;
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
2008-04-30 1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams
2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
2008-04-30 1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams
@ 2008-04-30 1:19 ` Dan Williams
2008-05-01 0:36 ` Neil Brown
2 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-04-30 1:19 UTC (permalink / raw)
To: neilb; +Cc: linux-raid
Now that queue flags are no longer atomic (commit:
75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) we must protect calls to
blk_remove_plug with spin_lock(q->queue_lock).
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/md/md.c | 3 +++
drivers/md/raid1.c | 10 ++++++++--
drivers/md/raid10.c | 10 ++++++++--
drivers/md/raid5.c | 12 ++++++++++--
include/linux/raid/md_k.h | 3 +++
5 files changed, 32 insertions(+), 6 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 606c8ee..e389978 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -283,6 +283,9 @@ static mddev_t * mddev_find(dev_t unit)
kfree(new);
return NULL;
}
+ spin_lock_init(&new->queue_lock);
+ new->queue->queue_lock = &new->queue_lock;
+
/* Can be unlocked because the queue is new: no concurrency */
queue_flag_set_unlocked(QUEUE_FLAG_CLUSTER, new->queue);
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6778b7c..79f19b1 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -603,9 +603,13 @@ static int flush_pending_writes(conf_t *conf)
spin_lock_irq(&conf->device_lock);
if (conf->pending_bio_list.head) {
+ struct request_queue *q = conf->mddev->queue;
struct bio *bio;
+
bio = bio_list_get(&conf->pending_bio_list);
- blk_remove_plug(conf->mddev->queue);
+ spin_lock(q->queue_lock);
+ blk_remove_plug(q);
+ spin_unlock(q->queue_lock);
spin_unlock_irq(&conf->device_lock);
/* flush any pending bitmap writes to
* disk before proceeding w/ I/O */
@@ -964,7 +968,9 @@ static int make_request(struct request_queue *q, struct bio * bio)
bio_list_merge(&conf->pending_bio_list, &bl);
bio_list_init(&bl);
- blk_plug_device(mddev->queue);
+ spin_lock(q->queue_lock);
+ blk_plug_device(q);
+ spin_unlock(q->queue_lock);
spin_unlock_irqrestore(&conf->device_lock, flags);
/* In case raid1d snuck into freeze_array */
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 5938fa9..3ecd437 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -646,9 +646,13 @@ static int flush_pending_writes(conf_t *conf)
spin_lock_irq(&conf->device_lock);
if (conf->pending_bio_list.head) {
+ struct request_queue *q = conf->mddev->queue;
struct bio *bio;
+
bio = bio_list_get(&conf->pending_bio_list);
- blk_remove_plug(conf->mddev->queue);
+ spin_lock(q->queue_lock);
+ blk_remove_plug(q);
+ spin_unlock(q->queue_lock);
spin_unlock_irq(&conf->device_lock);
/* flush any pending bitmap writes to disk
* before proceeding w/ I/O */
@@ -955,7 +959,9 @@ static int make_request(struct request_queue *q, struct bio * bio)
bitmap_startwrite(mddev->bitmap, bio->bi_sector, r10_bio->sectors, 0);
spin_lock_irqsave(&conf->device_lock, flags);
bio_list_merge(&conf->pending_bio_list, &bl);
- blk_plug_device(mddev->queue);
+ spin_lock(q->queue_lock);
+ blk_plug_device(q);
+ spin_unlock(q->queue_lock);
spin_unlock_irqrestore(&conf->device_lock, flags);
/* In case raid10d snuck in to freeze_array */
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e057944..b64a545 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -124,17 +124,23 @@ static void print_raid5_conf (raid5_conf_t *conf);
static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh)
{
+ struct request_queue *q = conf->mddev->queue;
+
if (atomic_dec_and_test(&sh->count)) {
BUG_ON(!list_empty(&sh->lru));
BUG_ON(atomic_read(&conf->active_stripes)==0);
if (test_bit(STRIPE_HANDLE, &sh->state)) {
if (test_bit(STRIPE_DELAYED, &sh->state)) {
list_add_tail(&sh->lru, &conf->delayed_list);
- blk_plug_device(conf->mddev->queue);
+ spin_lock(q->queue_lock);
+ blk_plug_device(q);
+ spin_unlock(q->queue_lock);
} else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
sh->bm_seq - conf->seq_write > 0) {
list_add_tail(&sh->lru, &conf->bitmap_list);
- blk_plug_device(conf->mddev->queue);
+ spin_lock(q->queue_lock);
+ blk_plug_device(q);
+ spin_unlock(q->queue_lock);
} else {
clear_bit(STRIPE_BIT_DELAY, &sh->state);
list_add_tail(&sh->lru, &conf->handle_list);
@@ -3268,10 +3274,12 @@ static void raid5_unplug_device(struct request_queue *q)
spin_lock_irqsave(&conf->device_lock, flags);
+ spin_lock(q->queue_lock);
if (blk_remove_plug(q)) {
conf->seq_flush++;
raid5_activate_delayed(conf);
}
+ spin_unlock(q->queue_lock);
md_wakeup_thread(mddev->thread);
spin_unlock_irqrestore(&conf->device_lock, flags);
diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h
index 812ffa5..000a215 100644
--- a/include/linux/raid/md_k.h
+++ b/include/linux/raid/md_k.h
@@ -240,6 +240,9 @@ struct mddev_s
struct timer_list safemode_timer;
atomic_t writes_pending;
struct request_queue *queue; /* for plugging ... */
+ spinlock_t queue_lock; /* we supply the lock
+ * for blk-core
+ */
atomic_t write_behind; /* outstanding async IO */
unsigned int max_write_behind; /* 0 = sync */
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
2008-04-30 1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams
@ 2008-05-01 0:36 ` Neil Brown
2008-05-01 0:50 ` Dan Williams
0 siblings, 1 reply; 11+ messages in thread
From: Neil Brown @ 2008-05-01 0:36 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-raid
On Tuesday April 29, dan.j.williams@intel.com wrote:
> Now that queue flags are no longer atomic (commit:
> 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) we must protect calls to
> blk_remove_plug with spin_lock(q->queue_lock).
Can't we just do
q->queue_lock = &conf->device_lock
and appropriate places in the various ->run functions?
It seems to be that we are doing appropriate locking, we just need to
convince queue_flag_set / queue_flag_clear that the correct lock is
locked.
??
NeilBrown
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>
> drivers/md/md.c | 3 +++
> drivers/md/raid1.c | 10 ++++++++--
> drivers/md/raid10.c | 10 ++++++++--
> drivers/md/raid5.c | 12 ++++++++++--
> include/linux/raid/md_k.h | 3 +++
> 5 files changed, 32 insertions(+), 6 deletions(-)
>
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 606c8ee..e389978 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -283,6 +283,9 @@ static mddev_t * mddev_find(dev_t unit)
> kfree(new);
> return NULL;
> }
> + spin_lock_init(&new->queue_lock);
> + new->queue->queue_lock = &new->queue_lock;
> +
> /* Can be unlocked because the queue is new: no concurrency */
> queue_flag_set_unlocked(QUEUE_FLAG_CLUSTER, new->queue);
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 6778b7c..79f19b1 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -603,9 +603,13 @@ static int flush_pending_writes(conf_t *conf)
> spin_lock_irq(&conf->device_lock);
>
> if (conf->pending_bio_list.head) {
> + struct request_queue *q = conf->mddev->queue;
> struct bio *bio;
> +
> bio = bio_list_get(&conf->pending_bio_list);
> - blk_remove_plug(conf->mddev->queue);
> + spin_lock(q->queue_lock);
> + blk_remove_plug(q);
> + spin_unlock(q->queue_lock);
> spin_unlock_irq(&conf->device_lock);
> /* flush any pending bitmap writes to
> * disk before proceeding w/ I/O */
> @@ -964,7 +968,9 @@ static int make_request(struct request_queue *q, struct bio * bio)
> bio_list_merge(&conf->pending_bio_list, &bl);
> bio_list_init(&bl);
>
> - blk_plug_device(mddev->queue);
> + spin_lock(q->queue_lock);
> + blk_plug_device(q);
> + spin_unlock(q->queue_lock);
> spin_unlock_irqrestore(&conf->device_lock, flags);
>
> /* In case raid1d snuck into freeze_array */
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 5938fa9..3ecd437 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -646,9 +646,13 @@ static int flush_pending_writes(conf_t *conf)
> spin_lock_irq(&conf->device_lock);
>
> if (conf->pending_bio_list.head) {
> + struct request_queue *q = conf->mddev->queue;
> struct bio *bio;
> +
> bio = bio_list_get(&conf->pending_bio_list);
> - blk_remove_plug(conf->mddev->queue);
> + spin_lock(q->queue_lock);
> + blk_remove_plug(q);
> + spin_unlock(q->queue_lock);
> spin_unlock_irq(&conf->device_lock);
> /* flush any pending bitmap writes to disk
> * before proceeding w/ I/O */
> @@ -955,7 +959,9 @@ static int make_request(struct request_queue *q, struct bio * bio)
> bitmap_startwrite(mddev->bitmap, bio->bi_sector, r10_bio->sectors, 0);
> spin_lock_irqsave(&conf->device_lock, flags);
> bio_list_merge(&conf->pending_bio_list, &bl);
> - blk_plug_device(mddev->queue);
> + spin_lock(q->queue_lock);
> + blk_plug_device(q);
> + spin_unlock(q->queue_lock);
> spin_unlock_irqrestore(&conf->device_lock, flags);
>
> /* In case raid10d snuck in to freeze_array */
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index e057944..b64a545 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -124,17 +124,23 @@ static void print_raid5_conf (raid5_conf_t *conf);
>
> static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh)
> {
> + struct request_queue *q = conf->mddev->queue;
> +
> if (atomic_dec_and_test(&sh->count)) {
> BUG_ON(!list_empty(&sh->lru));
> BUG_ON(atomic_read(&conf->active_stripes)==0);
> if (test_bit(STRIPE_HANDLE, &sh->state)) {
> if (test_bit(STRIPE_DELAYED, &sh->state)) {
> list_add_tail(&sh->lru, &conf->delayed_list);
> - blk_plug_device(conf->mddev->queue);
> + spin_lock(q->queue_lock);
> + blk_plug_device(q);
> + spin_unlock(q->queue_lock);
> } else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
> sh->bm_seq - conf->seq_write > 0) {
> list_add_tail(&sh->lru, &conf->bitmap_list);
> - blk_plug_device(conf->mddev->queue);
> + spin_lock(q->queue_lock);
> + blk_plug_device(q);
> + spin_unlock(q->queue_lock);
> } else {
> clear_bit(STRIPE_BIT_DELAY, &sh->state);
> list_add_tail(&sh->lru, &conf->handle_list);
> @@ -3268,10 +3274,12 @@ static void raid5_unplug_device(struct request_queue *q)
>
> spin_lock_irqsave(&conf->device_lock, flags);
>
> + spin_lock(q->queue_lock);
> if (blk_remove_plug(q)) {
> conf->seq_flush++;
> raid5_activate_delayed(conf);
> }
> + spin_unlock(q->queue_lock);
> md_wakeup_thread(mddev->thread);
>
> spin_unlock_irqrestore(&conf->device_lock, flags);
> diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h
> index 812ffa5..000a215 100644
> --- a/include/linux/raid/md_k.h
> +++ b/include/linux/raid/md_k.h
> @@ -240,6 +240,9 @@ struct mddev_s
> struct timer_list safemode_timer;
> atomic_t writes_pending;
> struct request_queue *queue; /* for plugging ... */
> + spinlock_t queue_lock; /* we supply the lock
> + * for blk-core
> + */
>
> atomic_t write_behind; /* outstanding async IO */
> unsigned int max_write_behind; /* 0 = sync */
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
2008-05-01 0:36 ` Neil Brown
@ 2008-05-01 0:50 ` Dan Williams
2008-05-01 2:33 ` Neil Brown
0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-05-01 0:50 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
On Wed, Apr 30, 2008 at 5:36 PM, Neil Brown <neilb@suse.de> wrote:
> On Tuesday April 29, dan.j.williams@intel.com wrote:
> > Now that queue flags are no longer atomic (commit:
> > 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) we must protect calls to
> > blk_remove_plug with spin_lock(q->queue_lock).
>
> Can't we just do
>
> q->queue_lock = &conf->device_lock
>
> and appropriate places in the various ->run functions?
>
> It seems to be that we are doing appropriate locking, we just need to
> convince queue_flag_set / queue_flag_clear that the correct lock is
> locked.
> ??
>
That's much simpler. Hmm... as an additional cleanup should
device_lock then move to mddev_t since it really is a mddev-level
lock? There seem to be more than a few places that cast
mddev->private just to get conf->device_lock. Then all this lock
initialization is centralized.
--
Dan
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
2008-05-01 0:50 ` Dan Williams
@ 2008-05-01 2:33 ` Neil Brown
0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2008-05-01 2:33 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-raid
On Wednesday April 30, dan.j.williams@intel.com wrote:
> On Wed, Apr 30, 2008 at 5:36 PM, Neil Brown <neilb@suse.de> wrote:
> > On Tuesday April 29, dan.j.williams@intel.com wrote:
> > > Now that queue flags are no longer atomic (commit:
> > > 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) we must protect calls to
> > > blk_remove_plug with spin_lock(q->queue_lock).
> >
> > Can't we just do
> >
> > q->queue_lock = &conf->device_lock
> >
> > and appropriate places in the various ->run functions?
> >
> > It seems to be that we are doing appropriate locking, we just need to
> > convince queue_flag_set / queue_flag_clear that the correct lock is
> > locked.
> > ??
> >
>
> That's much simpler. Hmm... as an additional cleanup should
> device_lock then move to mddev_t since it really is a mddev-level
> lock? There seem to be more than a few places that cast
> mddev->private just to get conf->device_lock. Then all this lock
> initialization is centralized.
Conversely, there seem to be plenty of places where we access
conf->device_lock and don't have an 'mddev' handy.
They could of course become
conf->mddev->device_lock
but I'm not sure that is an improvement.
I think you would need to actually do the change, then decide if the
benefits outweigh the costs.
It's probably best to find a minimal solution to the current problem
and just add the assignments to q->queue_lock for now. We can then
explore whether there are cleanup opportunities.
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events
2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
@ 2008-05-01 4:44 ` Neil Brown
2008-05-01 20:57 ` Dan Williams
0 siblings, 1 reply; 11+ messages in thread
From: Neil Brown @ 2008-05-01 4:44 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-raid
On Tuesday April 29, dan.j.williams@intel.com wrote:
> Also, when userspace writes 'active', clear all mddev->flags to satisfy
> md_write_start (the other bits do not matter to external-metadata arrays).
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Thanks Dan.
There are two things that I don't like about this patch.
1/ I don't think clearing all the flags in array_state_store is really
right. You do that to make sure that MD_CHANGE_DEVS is clear, but
there is no guarantee that it won't be set again before it is
tested in md_write_start.
I think it is best to just get md_write_start (and md_allow_write)
to just test the bits they are really interested in.
2/ having the tests of mddev->external isn't necessary. I really want
"array_state" to get a notification whenever it changes, no matter
what sort of metadata is being used.
Change the patch based on those observations I get:
---------------------------------
Signed-off-by: Neil Brown <neilb@suse.de>
### Diffstat output
./drivers/md/md.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c 2008-04-29 12:27:58.000000000 +1000
+++ ./drivers/md/md.c 2008-05-01 14:35:16.000000000 +1000
@@ -5434,8 +5434,11 @@ void md_write_start(mddev_t *mddev, stru
md_wakeup_thread(mddev->thread);
}
spin_unlock_irq(&mddev->write_lock);
+ sysfs_notify(&mddev->kobj, NULL, "array_state");
}
- wait_event(mddev->sb_wait, mddev->flags==0);
+ wait_event(mddev->sb_wait,
+ !test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
+ !test_bit(MD_CHANGE_PENDING, &mddev->flags));
}
void md_write_end(mddev_t *mddev)
@@ -5470,6 +5473,12 @@ void md_allow_write(mddev_t *mddev)
mddev->safemode = 1;
spin_unlock_irq(&mddev->write_lock);
md_update_sb(mddev, 0);
+
+ sysfs_notify(&mddev->kobj, NULL, "array_state");
+ /* wait for the dirty state to be recorded in the metadata */
+ wait_event(mddev->sb_wait,
+ !test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
+ !test_bit(MD_CHANGE_PENDING, &mddev->flags));
} else
spin_unlock_irq(&mddev->write_lock);
}
----------------------------
Look OK ??
Thanks,
NeilBrown
> ---
>
> drivers/md/md.c | 19 +++++++++++++++----
> 1 files changed, 15 insertions(+), 4 deletions(-)
>
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 6859bcd..ad53035 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -2664,7 +2664,7 @@ array_state_store(mddev_t *mddev, const char *buf, size_t len)
> if (mddev->pers) {
> restart_array(mddev);
> if (mddev->external)
> - clear_bit(MD_CHANGE_CLEAN, &mddev->flags);
> + mddev->flags = 0;
> wake_up(&mddev->sb_wait);
> err = 0;
> } else {
> @@ -5413,9 +5413,14 @@ void md_write_start(mddev_t *mddev, struct bio *bi)
> if (mddev->in_sync) {
> mddev->in_sync = 0;
> set_bit(MD_CHANGE_CLEAN, &mddev->flags);
> + spin_unlock_irq(&mddev->write_lock);
> +
> + /* notify userspace to handle clean->dirty */
> + if (mddev->external)
> + sysfs_notify(&mddev->kobj, NULL, "array_state");
> md_wakeup_thread(mddev->thread);
> - }
> - spin_unlock_irq(&mddev->write_lock);
> + } else
> + spin_unlock_irq(&mddev->write_lock);
> }
> wait_event(mddev->sb_wait, mddev->flags==0);
> }
> @@ -5451,7 +5456,13 @@ void md_allow_write(mddev_t *mddev)
> mddev->safemode == 0)
> mddev->safemode = 1;
> spin_unlock_irq(&mddev->write_lock);
> - md_update_sb(mddev, 0);
> +
> + /* wait for the dirty state to be recorded in the metadata */
> + if (mddev->external) {
> + sysfs_notify(&mddev->kobj, NULL, "array_state");
> + wait_event(mddev->sb_wait, mddev->flags == 0);
> + } else
> + md_update_sb(mddev, 0);
> } else
> spin_unlock_irq(&mddev->write_lock);
> }
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] md: ping userspace on 'stop' events
2008-04-30 1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams
@ 2008-05-01 4:50 ` Neil Brown
0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2008-05-01 4:50 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-raid
On Tuesday April 29, dan.j.williams@intel.com wrote:
> This additional notification to 'array_state' is needed to allow the monitor
> application to learn about stop events via sysfs. The
> sysfs_notify("sync_action") call that comes at the end of do_md_stop() (via
> md_new_event) is insufficient since the 'sync_action' attribute has been
> removed by this point.
>
> (Seems like a sysfs-notify-on-removal patch is a better fix. Currently removal
> updates the event count but does not wake up waiters)
>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
Thanks.
Again, drop the "if (mddev->external)". It isn't needed (I have made
this change).
NeilBrown
>
> drivers/md/md.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ad53035..606c8ee 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -3693,6 +3693,9 @@ static int do_md_stop(mddev_t * mddev, int mode)
>
> module_put(mddev->pers->owner);
> mddev->pers = NULL;
> + /* tell userspace to handle 'inactive' */
> + if (mddev->external)
> + sysfs_notify(&mddev->kobj, NULL, "array_state");
>
> set_capacity(disk, 0);
> mddev->changed = 1;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events
2008-05-01 4:44 ` Neil Brown
@ 2008-05-01 20:57 ` Dan Williams
2008-05-02 2:10 ` Neil Brown
0 siblings, 1 reply; 11+ messages in thread
From: Dan Williams @ 2008-05-01 20:57 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
On Wed, Apr 30, 2008 at 9:44 PM, Neil Brown <neilb@suse.de> wrote:
> On Tuesday April 29, dan.j.williams@intel.com wrote:
> > Also, when userspace writes 'active', clear all mddev->flags to satisfy
> > md_write_start (the other bits do not matter to external-metadata arrays).
> >
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> Thanks Dan.
>
> There are two things that I don't like about this patch.
>
> 1/ I don't think clearing all the flags in array_state_store is really
> right. You do that to make sure that MD_CHANGE_DEVS is clear, but
> there is no guarantee that it won't be set again before it is
> tested in md_write_start.
> I think it is best to just get md_write_start (and md_allow_write)
> to just test the bits they are really interested in.
>
> 2/ having the tests of mddev->external isn't necessary. I really want
> "array_state" to get a notification whenever it changes, no matter
> what sort of metadata is being used.
>
> Change the patch based on those observations I get:
>
> ---------------------------------
> Signed-off-by: Neil Brown <neilb@suse.de>
>
> ### Diffstat output
> ./drivers/md/md.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff .prev/drivers/md/md.c ./drivers/md/md.c
> --- .prev/drivers/md/md.c 2008-04-29 12:27:58.000000000 +1000
> +++ ./drivers/md/md.c 2008-05-01 14:35:16.000000000 +1000
> @@ -5434,8 +5434,11 @@ void md_write_start(mddev_t *mddev, stru
>
> md_wakeup_thread(mddev->thread);
> }
> spin_unlock_irq(&mddev->write_lock);
>
> + sysfs_notify(&mddev->kobj, NULL, "array_state");
> }
> - wait_event(mddev->sb_wait, mddev->flags==0);
> + wait_event(mddev->sb_wait,
> + !test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
> + !test_bit(MD_CHANGE_PENDING, &mddev->flags));
> }
>
> void md_write_end(mddev_t *mddev)
> @@ -5470,6 +5473,12 @@ void md_allow_write(mddev_t *mddev)
>
> mddev->safemode = 1;
> spin_unlock_irq(&mddev->write_lock);
> md_update_sb(mddev, 0);
> +
>
> + sysfs_notify(&mddev->kobj, NULL, "array_state");
>
> + /* wait for the dirty state to be recorded in the metadata */
> + wait_event(mddev->sb_wait,
> + !test_bit(MD_CHANGE_CLEAN, &mddev->flags) &&
> + !test_bit(MD_CHANGE_PENDING, &mddev->flags));
>
> } else
> spin_unlock_irq(&mddev->write_lock);
> }
> ----------------------------
>
> Look OK ??
>
[ thinking out loud here to make sure we are on the same page ]
I had to convince myself that it is still holding off writes while
MD_CHANGE_DEVS is set. And yes, MD_CHANGE_PENDING is not cleared
until after MD_CHANGE_DEVS is handled. So, the mddev->flags==0 test
was indeed more restrictive than it needed to be. Adding the
unqualified wait_event() to md_allow_write() also makes sense since it
is needed in the 'external' case and should be a nop in the
'!external' case since those bits are cleared in md_update_sb().
Simple test confirms this is working as expected.
--
Dan
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events
2008-05-01 20:57 ` Dan Williams
@ 2008-05-02 2:10 ` Neil Brown
0 siblings, 0 replies; 11+ messages in thread
From: Neil Brown @ 2008-05-02 2:10 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-raid
On Thursday May 1, dan.j.williams@intel.com wrote:
>
> [ thinking out loud here to make sure we are on the same page ]
>
> I had to convince myself that it is still holding off writes while
> MD_CHANGE_DEVS is set. And yes, MD_CHANGE_PENDING is not cleared
> until after MD_CHANGE_DEVS is handled. So, the mddev->flags==0 test
> was indeed more restrictive than it needed to be. Adding the
> unqualified wait_event() to md_allow_write() also makes sense since it
> is needed in the 'external' case and should be a nop in the
> '!external' case since those bits are cleared in md_update_sb().
>
> Simple test confirms this is working as expected.
Good, thanks.
NeilBrown
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2008-05-02 2:10 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-30 1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams
2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
2008-05-01 4:44 ` Neil Brown
2008-05-01 20:57 ` Dan Williams
2008-05-02 2:10 ` Neil Brown
2008-04-30 1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams
2008-05-01 4:50 ` Neil Brown
2008-04-30 1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams
2008-05-01 0:36 ` Neil Brown
2008-05-01 0:50 ` Dan Williams
2008-05-01 2:33 ` Neil Brown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).