* [PATCH 0/3] A few more md patches for 2.6.26
@ 2008-04-30 1:19 Dan Williams
2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Dan Williams @ 2008-04-30 1:19 UTC (permalink / raw)
To: neilb; +Cc: linux-raid
Hi Neil,
The first two patches add some necessary sysfs_notify calls for external
metadata arrays. They replace
md-ping-userspace-on-ext-metadata-events.patch which, as you noted, did
these notifications globally in md_check_recovery() rather than at the
local site where the state changes.
While testing these I ran into:
WARNING: at include/linux/blkdev.h:443 queue_flag_clear+0x2c/0x4d()
...due to the recent "block: make queue flags non-atomic" (commit:
75ad23bc0fcb4f992a5d06982bf0857ab1738e9e). I believe patch 3/3 is the
intended fix.
---
Dan Williams (3):
md: add new / required locking for calls to blk_remove_plug and blk_plug_queue
md: ping userspace on 'stop' events
md: ping userspace on 'write-pending' events
drivers/md/md.c | 25 +++++++++++++++++++++----
drivers/md/raid1.c | 10 ++++++++--
drivers/md/raid10.c | 10 ++++++++--
drivers/md/raid5.c | 12 ++++++++++--
include/linux/raid/md_k.h | 3 +++
5 files changed, 50 insertions(+), 10 deletions(-)
--
Regards,
Dan
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 1/3] md: ping userspace on 'write-pending' events 2008-04-30 1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams @ 2008-04-30 1:19 ` Dan Williams 2008-05-01 4:44 ` Neil Brown 2008-04-30 1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams 2008-04-30 1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams 2 siblings, 1 reply; 11+ messages in thread From: Dan Williams @ 2008-04-30 1:19 UTC (permalink / raw) To: neilb; +Cc: linux-raid Also, when userspace writes 'active', clear all mddev->flags to satisfy md_write_start (the other bits do not matter to external-metadata arrays). Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- drivers/md/md.c | 19 +++++++++++++++---- 1 files changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 6859bcd..ad53035 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -2664,7 +2664,7 @@ array_state_store(mddev_t *mddev, const char *buf, size_t len) if (mddev->pers) { restart_array(mddev); if (mddev->external) - clear_bit(MD_CHANGE_CLEAN, &mddev->flags); + mddev->flags = 0; wake_up(&mddev->sb_wait); err = 0; } else { @@ -5413,9 +5413,14 @@ void md_write_start(mddev_t *mddev, struct bio *bi) if (mddev->in_sync) { mddev->in_sync = 0; set_bit(MD_CHANGE_CLEAN, &mddev->flags); + spin_unlock_irq(&mddev->write_lock); + + /* notify userspace to handle clean->dirty */ + if (mddev->external) + sysfs_notify(&mddev->kobj, NULL, "array_state"); md_wakeup_thread(mddev->thread); - } - spin_unlock_irq(&mddev->write_lock); + } else + spin_unlock_irq(&mddev->write_lock); } wait_event(mddev->sb_wait, mddev->flags==0); } @@ -5451,7 +5456,13 @@ void md_allow_write(mddev_t *mddev) mddev->safemode == 0) mddev->safemode = 1; spin_unlock_irq(&mddev->write_lock); - md_update_sb(mddev, 0); + + /* wait for the dirty state to be recorded in the metadata */ + if (mddev->external) { + sysfs_notify(&mddev->kobj, NULL, "array_state"); + wait_event(mddev->sb_wait, mddev->flags == 0); + } else + md_update_sb(mddev, 0); } else spin_unlock_irq(&mddev->write_lock); } ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events 2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams @ 2008-05-01 4:44 ` Neil Brown 2008-05-01 20:57 ` Dan Williams 0 siblings, 1 reply; 11+ messages in thread From: Neil Brown @ 2008-05-01 4:44 UTC (permalink / raw) To: Dan Williams; +Cc: linux-raid On Tuesday April 29, dan.j.williams@intel.com wrote: > Also, when userspace writes 'active', clear all mddev->flags to satisfy > md_write_start (the other bits do not matter to external-metadata arrays). > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> Thanks Dan. There are two things that I don't like about this patch. 1/ I don't think clearing all the flags in array_state_store is really right. You do that to make sure that MD_CHANGE_DEVS is clear, but there is no guarantee that it won't be set again before it is tested in md_write_start. I think it is best to just get md_write_start (and md_allow_write) to just test the bits they are really interested in. 2/ having the tests of mddev->external isn't necessary. I really want "array_state" to get a notification whenever it changes, no matter what sort of metadata is being used. Change the patch based on those observations I get: --------------------------------- Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./drivers/md/md.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff .prev/drivers/md/md.c ./drivers/md/md.c --- .prev/drivers/md/md.c 2008-04-29 12:27:58.000000000 +1000 +++ ./drivers/md/md.c 2008-05-01 14:35:16.000000000 +1000 @@ -5434,8 +5434,11 @@ void md_write_start(mddev_t *mddev, stru md_wakeup_thread(mddev->thread); } spin_unlock_irq(&mddev->write_lock); + sysfs_notify(&mddev->kobj, NULL, "array_state"); } - wait_event(mddev->sb_wait, mddev->flags==0); + wait_event(mddev->sb_wait, + !test_bit(MD_CHANGE_CLEAN, &mddev->flags) && + !test_bit(MD_CHANGE_PENDING, &mddev->flags)); } void md_write_end(mddev_t *mddev) @@ -5470,6 +5473,12 @@ void md_allow_write(mddev_t *mddev) mddev->safemode = 1; spin_unlock_irq(&mddev->write_lock); md_update_sb(mddev, 0); + + sysfs_notify(&mddev->kobj, NULL, "array_state"); + /* wait for the dirty state to be recorded in the metadata */ + wait_event(mddev->sb_wait, + !test_bit(MD_CHANGE_CLEAN, &mddev->flags) && + !test_bit(MD_CHANGE_PENDING, &mddev->flags)); } else spin_unlock_irq(&mddev->write_lock); } ---------------------------- Look OK ?? Thanks, NeilBrown > --- > > drivers/md/md.c | 19 +++++++++++++++---- > 1 files changed, 15 insertions(+), 4 deletions(-) > > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index 6859bcd..ad53035 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -2664,7 +2664,7 @@ array_state_store(mddev_t *mddev, const char *buf, size_t len) > if (mddev->pers) { > restart_array(mddev); > if (mddev->external) > - clear_bit(MD_CHANGE_CLEAN, &mddev->flags); > + mddev->flags = 0; > wake_up(&mddev->sb_wait); > err = 0; > } else { > @@ -5413,9 +5413,14 @@ void md_write_start(mddev_t *mddev, struct bio *bi) > if (mddev->in_sync) { > mddev->in_sync = 0; > set_bit(MD_CHANGE_CLEAN, &mddev->flags); > + spin_unlock_irq(&mddev->write_lock); > + > + /* notify userspace to handle clean->dirty */ > + if (mddev->external) > + sysfs_notify(&mddev->kobj, NULL, "array_state"); > md_wakeup_thread(mddev->thread); > - } > - spin_unlock_irq(&mddev->write_lock); > + } else > + spin_unlock_irq(&mddev->write_lock); > } > wait_event(mddev->sb_wait, mddev->flags==0); > } > @@ -5451,7 +5456,13 @@ void md_allow_write(mddev_t *mddev) > mddev->safemode == 0) > mddev->safemode = 1; > spin_unlock_irq(&mddev->write_lock); > - md_update_sb(mddev, 0); > + > + /* wait for the dirty state to be recorded in the metadata */ > + if (mddev->external) { > + sysfs_notify(&mddev->kobj, NULL, "array_state"); > + wait_event(mddev->sb_wait, mddev->flags == 0); > + } else > + md_update_sb(mddev, 0); > } else > spin_unlock_irq(&mddev->write_lock); > } ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events 2008-05-01 4:44 ` Neil Brown @ 2008-05-01 20:57 ` Dan Williams 2008-05-02 2:10 ` Neil Brown 0 siblings, 1 reply; 11+ messages in thread From: Dan Williams @ 2008-05-01 20:57 UTC (permalink / raw) To: Neil Brown; +Cc: linux-raid On Wed, Apr 30, 2008 at 9:44 PM, Neil Brown <neilb@suse.de> wrote: > On Tuesday April 29, dan.j.williams@intel.com wrote: > > Also, when userspace writes 'active', clear all mddev->flags to satisfy > > md_write_start (the other bits do not matter to external-metadata arrays). > > > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > > Thanks Dan. > > There are two things that I don't like about this patch. > > 1/ I don't think clearing all the flags in array_state_store is really > right. You do that to make sure that MD_CHANGE_DEVS is clear, but > there is no guarantee that it won't be set again before it is > tested in md_write_start. > I think it is best to just get md_write_start (and md_allow_write) > to just test the bits they are really interested in. > > 2/ having the tests of mddev->external isn't necessary. I really want > "array_state" to get a notification whenever it changes, no matter > what sort of metadata is being used. > > Change the patch based on those observations I get: > > --------------------------------- > Signed-off-by: Neil Brown <neilb@suse.de> > > ### Diffstat output > ./drivers/md/md.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff .prev/drivers/md/md.c ./drivers/md/md.c > --- .prev/drivers/md/md.c 2008-04-29 12:27:58.000000000 +1000 > +++ ./drivers/md/md.c 2008-05-01 14:35:16.000000000 +1000 > @@ -5434,8 +5434,11 @@ void md_write_start(mddev_t *mddev, stru > > md_wakeup_thread(mddev->thread); > } > spin_unlock_irq(&mddev->write_lock); > > + sysfs_notify(&mddev->kobj, NULL, "array_state"); > } > - wait_event(mddev->sb_wait, mddev->flags==0); > + wait_event(mddev->sb_wait, > + !test_bit(MD_CHANGE_CLEAN, &mddev->flags) && > + !test_bit(MD_CHANGE_PENDING, &mddev->flags)); > } > > void md_write_end(mddev_t *mddev) > @@ -5470,6 +5473,12 @@ void md_allow_write(mddev_t *mddev) > > mddev->safemode = 1; > spin_unlock_irq(&mddev->write_lock); > md_update_sb(mddev, 0); > + > > + sysfs_notify(&mddev->kobj, NULL, "array_state"); > > + /* wait for the dirty state to be recorded in the metadata */ > + wait_event(mddev->sb_wait, > + !test_bit(MD_CHANGE_CLEAN, &mddev->flags) && > + !test_bit(MD_CHANGE_PENDING, &mddev->flags)); > > } else > spin_unlock_irq(&mddev->write_lock); > } > ---------------------------- > > Look OK ?? > [ thinking out loud here to make sure we are on the same page ] I had to convince myself that it is still holding off writes while MD_CHANGE_DEVS is set. And yes, MD_CHANGE_PENDING is not cleared until after MD_CHANGE_DEVS is handled. So, the mddev->flags==0 test was indeed more restrictive than it needed to be. Adding the unqualified wait_event() to md_allow_write() also makes sense since it is needed in the 'external' case and should be a nop in the '!external' case since those bits are cleared in md_update_sb(). Simple test confirms this is working as expected. -- Dan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] md: ping userspace on 'write-pending' events 2008-05-01 20:57 ` Dan Williams @ 2008-05-02 2:10 ` Neil Brown 0 siblings, 0 replies; 11+ messages in thread From: Neil Brown @ 2008-05-02 2:10 UTC (permalink / raw) To: Dan Williams; +Cc: linux-raid On Thursday May 1, dan.j.williams@intel.com wrote: > > [ thinking out loud here to make sure we are on the same page ] > > I had to convince myself that it is still holding off writes while > MD_CHANGE_DEVS is set. And yes, MD_CHANGE_PENDING is not cleared > until after MD_CHANGE_DEVS is handled. So, the mddev->flags==0 test > was indeed more restrictive than it needed to be. Adding the > unqualified wait_event() to md_allow_write() also makes sense since it > is needed in the 'external' case and should be a nop in the > '!external' case since those bits are cleared in md_update_sb(). > > Simple test confirms this is working as expected. Good, thanks. NeilBrown ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/3] md: ping userspace on 'stop' events 2008-04-30 1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams 2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams @ 2008-04-30 1:19 ` Dan Williams 2008-05-01 4:50 ` Neil Brown 2008-04-30 1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams 2 siblings, 1 reply; 11+ messages in thread From: Dan Williams @ 2008-04-30 1:19 UTC (permalink / raw) To: neilb; +Cc: linux-raid This additional notification to 'array_state' is needed to allow the monitor application to learn about stop events via sysfs. The sysfs_notify("sync_action") call that comes at the end of do_md_stop() (via md_new_event) is insufficient since the 'sync_action' attribute has been removed by this point. (Seems like a sysfs-notify-on-removal patch is a better fix. Currently removal updates the event count but does not wake up waiters) Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- drivers/md/md.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index ad53035..606c8ee 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -3693,6 +3693,9 @@ static int do_md_stop(mddev_t * mddev, int mode) module_put(mddev->pers->owner); mddev->pers = NULL; + /* tell userspace to handle 'inactive' */ + if (mddev->external) + sysfs_notify(&mddev->kobj, NULL, "array_state"); set_capacity(disk, 0); mddev->changed = 1; ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] md: ping userspace on 'stop' events 2008-04-30 1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams @ 2008-05-01 4:50 ` Neil Brown 0 siblings, 0 replies; 11+ messages in thread From: Neil Brown @ 2008-05-01 4:50 UTC (permalink / raw) To: Dan Williams; +Cc: linux-raid On Tuesday April 29, dan.j.williams@intel.com wrote: > This additional notification to 'array_state' is needed to allow the monitor > application to learn about stop events via sysfs. The > sysfs_notify("sync_action") call that comes at the end of do_md_stop() (via > md_new_event) is insufficient since the 'sync_action' attribute has been > removed by this point. > > (Seems like a sysfs-notify-on-removal patch is a better fix. Currently removal > updates the event count but does not wake up waiters) > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- Thanks. Again, drop the "if (mddev->external)". It isn't needed (I have made this change). NeilBrown > > drivers/md/md.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index ad53035..606c8ee 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -3693,6 +3693,9 @@ static int do_md_stop(mddev_t * mddev, int mode) > > module_put(mddev->pers->owner); > mddev->pers = NULL; > + /* tell userspace to handle 'inactive' */ > + if (mddev->external) > + sysfs_notify(&mddev->kobj, NULL, "array_state"); > > set_capacity(disk, 0); > mddev->changed = 1; ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue 2008-04-30 1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams 2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams 2008-04-30 1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams @ 2008-04-30 1:19 ` Dan Williams 2008-05-01 0:36 ` Neil Brown 2 siblings, 1 reply; 11+ messages in thread From: Dan Williams @ 2008-04-30 1:19 UTC (permalink / raw) To: neilb; +Cc: linux-raid Now that queue flags are no longer atomic (commit: 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) we must protect calls to blk_remove_plug with spin_lock(q->queue_lock). Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- drivers/md/md.c | 3 +++ drivers/md/raid1.c | 10 ++++++++-- drivers/md/raid10.c | 10 ++++++++-- drivers/md/raid5.c | 12 ++++++++++-- include/linux/raid/md_k.h | 3 +++ 5 files changed, 32 insertions(+), 6 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 606c8ee..e389978 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -283,6 +283,9 @@ static mddev_t * mddev_find(dev_t unit) kfree(new); return NULL; } + spin_lock_init(&new->queue_lock); + new->queue->queue_lock = &new->queue_lock; + /* Can be unlocked because the queue is new: no concurrency */ queue_flag_set_unlocked(QUEUE_FLAG_CLUSTER, new->queue); diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 6778b7c..79f19b1 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -603,9 +603,13 @@ static int flush_pending_writes(conf_t *conf) spin_lock_irq(&conf->device_lock); if (conf->pending_bio_list.head) { + struct request_queue *q = conf->mddev->queue; struct bio *bio; + bio = bio_list_get(&conf->pending_bio_list); - blk_remove_plug(conf->mddev->queue); + spin_lock(q->queue_lock); + blk_remove_plug(q); + spin_unlock(q->queue_lock); spin_unlock_irq(&conf->device_lock); /* flush any pending bitmap writes to * disk before proceeding w/ I/O */ @@ -964,7 +968,9 @@ static int make_request(struct request_queue *q, struct bio * bio) bio_list_merge(&conf->pending_bio_list, &bl); bio_list_init(&bl); - blk_plug_device(mddev->queue); + spin_lock(q->queue_lock); + blk_plug_device(q); + spin_unlock(q->queue_lock); spin_unlock_irqrestore(&conf->device_lock, flags); /* In case raid1d snuck into freeze_array */ diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 5938fa9..3ecd437 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -646,9 +646,13 @@ static int flush_pending_writes(conf_t *conf) spin_lock_irq(&conf->device_lock); if (conf->pending_bio_list.head) { + struct request_queue *q = conf->mddev->queue; struct bio *bio; + bio = bio_list_get(&conf->pending_bio_list); - blk_remove_plug(conf->mddev->queue); + spin_lock(q->queue_lock); + blk_remove_plug(q); + spin_unlock(q->queue_lock); spin_unlock_irq(&conf->device_lock); /* flush any pending bitmap writes to disk * before proceeding w/ I/O */ @@ -955,7 +959,9 @@ static int make_request(struct request_queue *q, struct bio * bio) bitmap_startwrite(mddev->bitmap, bio->bi_sector, r10_bio->sectors, 0); spin_lock_irqsave(&conf->device_lock, flags); bio_list_merge(&conf->pending_bio_list, &bl); - blk_plug_device(mddev->queue); + spin_lock(q->queue_lock); + blk_plug_device(q); + spin_unlock(q->queue_lock); spin_unlock_irqrestore(&conf->device_lock, flags); /* In case raid10d snuck in to freeze_array */ diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index e057944..b64a545 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -124,17 +124,23 @@ static void print_raid5_conf (raid5_conf_t *conf); static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh) { + struct request_queue *q = conf->mddev->queue; + if (atomic_dec_and_test(&sh->count)) { BUG_ON(!list_empty(&sh->lru)); BUG_ON(atomic_read(&conf->active_stripes)==0); if (test_bit(STRIPE_HANDLE, &sh->state)) { if (test_bit(STRIPE_DELAYED, &sh->state)) { list_add_tail(&sh->lru, &conf->delayed_list); - blk_plug_device(conf->mddev->queue); + spin_lock(q->queue_lock); + blk_plug_device(q); + spin_unlock(q->queue_lock); } else if (test_bit(STRIPE_BIT_DELAY, &sh->state) && sh->bm_seq - conf->seq_write > 0) { list_add_tail(&sh->lru, &conf->bitmap_list); - blk_plug_device(conf->mddev->queue); + spin_lock(q->queue_lock); + blk_plug_device(q); + spin_unlock(q->queue_lock); } else { clear_bit(STRIPE_BIT_DELAY, &sh->state); list_add_tail(&sh->lru, &conf->handle_list); @@ -3268,10 +3274,12 @@ static void raid5_unplug_device(struct request_queue *q) spin_lock_irqsave(&conf->device_lock, flags); + spin_lock(q->queue_lock); if (blk_remove_plug(q)) { conf->seq_flush++; raid5_activate_delayed(conf); } + spin_unlock(q->queue_lock); md_wakeup_thread(mddev->thread); spin_unlock_irqrestore(&conf->device_lock, flags); diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h index 812ffa5..000a215 100644 --- a/include/linux/raid/md_k.h +++ b/include/linux/raid/md_k.h @@ -240,6 +240,9 @@ struct mddev_s struct timer_list safemode_timer; atomic_t writes_pending; struct request_queue *queue; /* for plugging ... */ + spinlock_t queue_lock; /* we supply the lock + * for blk-core + */ atomic_t write_behind; /* outstanding async IO */ unsigned int max_write_behind; /* 0 = sync */ ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue 2008-04-30 1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams @ 2008-05-01 0:36 ` Neil Brown 2008-05-01 0:50 ` Dan Williams 0 siblings, 1 reply; 11+ messages in thread From: Neil Brown @ 2008-05-01 0:36 UTC (permalink / raw) To: Dan Williams; +Cc: linux-raid On Tuesday April 29, dan.j.williams@intel.com wrote: > Now that queue flags are no longer atomic (commit: > 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) we must protect calls to > blk_remove_plug with spin_lock(q->queue_lock). Can't we just do q->queue_lock = &conf->device_lock and appropriate places in the various ->run functions? It seems to be that we are doing appropriate locking, we just need to convince queue_flag_set / queue_flag_clear that the correct lock is locked. ?? NeilBrown > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- > > drivers/md/md.c | 3 +++ > drivers/md/raid1.c | 10 ++++++++-- > drivers/md/raid10.c | 10 ++++++++-- > drivers/md/raid5.c | 12 ++++++++++-- > include/linux/raid/md_k.h | 3 +++ > 5 files changed, 32 insertions(+), 6 deletions(-) > > > diff --git a/drivers/md/md.c b/drivers/md/md.c > index 606c8ee..e389978 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -283,6 +283,9 @@ static mddev_t * mddev_find(dev_t unit) > kfree(new); > return NULL; > } > + spin_lock_init(&new->queue_lock); > + new->queue->queue_lock = &new->queue_lock; > + > /* Can be unlocked because the queue is new: no concurrency */ > queue_flag_set_unlocked(QUEUE_FLAG_CLUSTER, new->queue); > > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c > index 6778b7c..79f19b1 100644 > --- a/drivers/md/raid1.c > +++ b/drivers/md/raid1.c > @@ -603,9 +603,13 @@ static int flush_pending_writes(conf_t *conf) > spin_lock_irq(&conf->device_lock); > > if (conf->pending_bio_list.head) { > + struct request_queue *q = conf->mddev->queue; > struct bio *bio; > + > bio = bio_list_get(&conf->pending_bio_list); > - blk_remove_plug(conf->mddev->queue); > + spin_lock(q->queue_lock); > + blk_remove_plug(q); > + spin_unlock(q->queue_lock); > spin_unlock_irq(&conf->device_lock); > /* flush any pending bitmap writes to > * disk before proceeding w/ I/O */ > @@ -964,7 +968,9 @@ static int make_request(struct request_queue *q, struct bio * bio) > bio_list_merge(&conf->pending_bio_list, &bl); > bio_list_init(&bl); > > - blk_plug_device(mddev->queue); > + spin_lock(q->queue_lock); > + blk_plug_device(q); > + spin_unlock(q->queue_lock); > spin_unlock_irqrestore(&conf->device_lock, flags); > > /* In case raid1d snuck into freeze_array */ > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c > index 5938fa9..3ecd437 100644 > --- a/drivers/md/raid10.c > +++ b/drivers/md/raid10.c > @@ -646,9 +646,13 @@ static int flush_pending_writes(conf_t *conf) > spin_lock_irq(&conf->device_lock); > > if (conf->pending_bio_list.head) { > + struct request_queue *q = conf->mddev->queue; > struct bio *bio; > + > bio = bio_list_get(&conf->pending_bio_list); > - blk_remove_plug(conf->mddev->queue); > + spin_lock(q->queue_lock); > + blk_remove_plug(q); > + spin_unlock(q->queue_lock); > spin_unlock_irq(&conf->device_lock); > /* flush any pending bitmap writes to disk > * before proceeding w/ I/O */ > @@ -955,7 +959,9 @@ static int make_request(struct request_queue *q, struct bio * bio) > bitmap_startwrite(mddev->bitmap, bio->bi_sector, r10_bio->sectors, 0); > spin_lock_irqsave(&conf->device_lock, flags); > bio_list_merge(&conf->pending_bio_list, &bl); > - blk_plug_device(mddev->queue); > + spin_lock(q->queue_lock); > + blk_plug_device(q); > + spin_unlock(q->queue_lock); > spin_unlock_irqrestore(&conf->device_lock, flags); > > /* In case raid10d snuck in to freeze_array */ > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > index e057944..b64a545 100644 > --- a/drivers/md/raid5.c > +++ b/drivers/md/raid5.c > @@ -124,17 +124,23 @@ static void print_raid5_conf (raid5_conf_t *conf); > > static void __release_stripe(raid5_conf_t *conf, struct stripe_head *sh) > { > + struct request_queue *q = conf->mddev->queue; > + > if (atomic_dec_and_test(&sh->count)) { > BUG_ON(!list_empty(&sh->lru)); > BUG_ON(atomic_read(&conf->active_stripes)==0); > if (test_bit(STRIPE_HANDLE, &sh->state)) { > if (test_bit(STRIPE_DELAYED, &sh->state)) { > list_add_tail(&sh->lru, &conf->delayed_list); > - blk_plug_device(conf->mddev->queue); > + spin_lock(q->queue_lock); > + blk_plug_device(q); > + spin_unlock(q->queue_lock); > } else if (test_bit(STRIPE_BIT_DELAY, &sh->state) && > sh->bm_seq - conf->seq_write > 0) { > list_add_tail(&sh->lru, &conf->bitmap_list); > - blk_plug_device(conf->mddev->queue); > + spin_lock(q->queue_lock); > + blk_plug_device(q); > + spin_unlock(q->queue_lock); > } else { > clear_bit(STRIPE_BIT_DELAY, &sh->state); > list_add_tail(&sh->lru, &conf->handle_list); > @@ -3268,10 +3274,12 @@ static void raid5_unplug_device(struct request_queue *q) > > spin_lock_irqsave(&conf->device_lock, flags); > > + spin_lock(q->queue_lock); > if (blk_remove_plug(q)) { > conf->seq_flush++; > raid5_activate_delayed(conf); > } > + spin_unlock(q->queue_lock); > md_wakeup_thread(mddev->thread); > > spin_unlock_irqrestore(&conf->device_lock, flags); > diff --git a/include/linux/raid/md_k.h b/include/linux/raid/md_k.h > index 812ffa5..000a215 100644 > --- a/include/linux/raid/md_k.h > +++ b/include/linux/raid/md_k.h > @@ -240,6 +240,9 @@ struct mddev_s > struct timer_list safemode_timer; > atomic_t writes_pending; > struct request_queue *queue; /* for plugging ... */ > + spinlock_t queue_lock; /* we supply the lock > + * for blk-core > + */ > > atomic_t write_behind; /* outstanding async IO */ > unsigned int max_write_behind; /* 0 = sync */ ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue 2008-05-01 0:36 ` Neil Brown @ 2008-05-01 0:50 ` Dan Williams 2008-05-01 2:33 ` Neil Brown 0 siblings, 1 reply; 11+ messages in thread From: Dan Williams @ 2008-05-01 0:50 UTC (permalink / raw) To: Neil Brown; +Cc: linux-raid On Wed, Apr 30, 2008 at 5:36 PM, Neil Brown <neilb@suse.de> wrote: > On Tuesday April 29, dan.j.williams@intel.com wrote: > > Now that queue flags are no longer atomic (commit: > > 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) we must protect calls to > > blk_remove_plug with spin_lock(q->queue_lock). > > Can't we just do > > q->queue_lock = &conf->device_lock > > and appropriate places in the various ->run functions? > > It seems to be that we are doing appropriate locking, we just need to > convince queue_flag_set / queue_flag_clear that the correct lock is > locked. > ?? > That's much simpler. Hmm... as an additional cleanup should device_lock then move to mddev_t since it really is a mddev-level lock? There seem to be more than a few places that cast mddev->private just to get conf->device_lock. Then all this lock initialization is centralized. -- Dan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue 2008-05-01 0:50 ` Dan Williams @ 2008-05-01 2:33 ` Neil Brown 0 siblings, 0 replies; 11+ messages in thread From: Neil Brown @ 2008-05-01 2:33 UTC (permalink / raw) To: Dan Williams; +Cc: linux-raid On Wednesday April 30, dan.j.williams@intel.com wrote: > On Wed, Apr 30, 2008 at 5:36 PM, Neil Brown <neilb@suse.de> wrote: > > On Tuesday April 29, dan.j.williams@intel.com wrote: > > > Now that queue flags are no longer atomic (commit: > > > 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) we must protect calls to > > > blk_remove_plug with spin_lock(q->queue_lock). > > > > Can't we just do > > > > q->queue_lock = &conf->device_lock > > > > and appropriate places in the various ->run functions? > > > > It seems to be that we are doing appropriate locking, we just need to > > convince queue_flag_set / queue_flag_clear that the correct lock is > > locked. > > ?? > > > > That's much simpler. Hmm... as an additional cleanup should > device_lock then move to mddev_t since it really is a mddev-level > lock? There seem to be more than a few places that cast > mddev->private just to get conf->device_lock. Then all this lock > initialization is centralized. Conversely, there seem to be plenty of places where we access conf->device_lock and don't have an 'mddev' handy. They could of course become conf->mddev->device_lock but I'm not sure that is an improvement. I think you would need to actually do the change, then decide if the benefits outweigh the costs. It's probably best to find a minimal solution to the current problem and just add the assignments to q->queue_lock for now. We can then explore whether there are cleanup opportunities. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2008-05-02 2:10 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-04-30 1:19 [PATCH 0/3] A few more md patches for 2.6.26 Dan Williams 2008-04-30 1:19 ` [PATCH 1/3] md: ping userspace on 'write-pending' events Dan Williams 2008-05-01 4:44 ` Neil Brown 2008-05-01 20:57 ` Dan Williams 2008-05-02 2:10 ` Neil Brown 2008-04-30 1:19 ` [PATCH 2/3] md: ping userspace on 'stop' events Dan Williams 2008-05-01 4:50 ` Neil Brown 2008-04-30 1:19 ` [PATCH 3/3] md: add new / required locking for calls to blk_remove_plug and blk_plug_queue Dan Williams 2008-05-01 0:36 ` Neil Brown 2008-05-01 0:50 ` Dan Williams 2008-05-01 2:33 ` Neil Brown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).