* [patch 1/2] raid6_end_write_request() spinlock fix
From: Coywolf Qi Hunt @ 2006-04-25  3:35 UTC
To: akpm; +Cc: neilb, linux-kernel, linux-raid

Hello,

Reduce the raid6_end_write_request() spinlock window.

Signed-off-by: Coywolf Qi Hunt <qiyong@fc-cn.com>
---

diff --git a/drivers/md/raid6main.c b/drivers/md/raid6main.c
index bc69355..820536e 100644
--- a/drivers/md/raid6main.c
+++ b/drivers/md/raid6main.c
@@ -468,7 +468,6 @@ static int raid6_end_write_request (stru
 	struct stripe_head *sh = bi->bi_private;
 	raid6_conf_t *conf = sh->raid_conf;
 	int disks = conf->raid_disks, i;
-	unsigned long flags;
 	int uptodate = test_bit(BIO_UPTODATE, &bi->bi_flags);

 	if (bi->bi_size)
@@ -486,16 +485,14 @@ static int raid6_end_write_request (stru
 		return 0;
 	}

-	spin_lock_irqsave(&conf->device_lock, flags);
 	if (!uptodate)
 		md_error(conf->mddev, conf->disks[i].rdev);

 	rdev_dec_pending(conf->disks[i].rdev, conf->mddev);
-
 	clear_bit(R5_LOCKED, &sh->dev[i].flags);
 	set_bit(STRIPE_HANDLE, &sh->state);
-	__release_stripe(conf, sh);
-	spin_unlock_irqrestore(&conf->device_lock, flags);
+	release_stripe(sh);
+
 	return 0;
 }

--
Coywolf Qi Hunt
* Re: [patch 1/2] raid6_end_write_request() spinlock fix
From: Neil Brown @ 2006-04-25  5:13 UTC
To: Coywolf Qi Hunt; +Cc: akpm, linux-kernel, linux-raid

On Tuesday April 25, qiyong@fc-cn.com wrote:
> Hello,
>
> Reduce the raid6_end_write_request() spinlock window.

Andrew: please don't include these in -mm.  This one and the
corresponding raid5 patch are wrong, and I'm not yet sure about the
unplug_device changes.

In this case, the call to md_error, which in turn calls "error" in
raid6main.c, requires the lock to be held as it contains:

	if (!test_bit(Faulty, &rdev->flags)) {
		mddev->sb_dirty = 1;
		if (test_bit(In_sync, &rdev->flags)) {
			conf->working_disks--;
			mddev->degraded++;
			conf->failed_disks++;
			clear_bit(In_sync, &rdev->flags);
			/*
			 * if recovery was running, make sure it aborts.
			 */
			set_bit(MD_RECOVERY_ERR, &mddev->recovery);
		}
		set_bit(Faulty, &rdev->flags);

which is fairly clearly not safe without some locking.

Coywolf: As I think I have already said, I appreciate your review of
the md/raid code and your attempts to improve it - I'm sure there is
plenty of room to make improvements.

However, posting patches with minimal commentary on code that you don't
fully understand is not the best way to work with the community.
If you see something that you think is wrong, it is much better to ask
why it is the way it is, explain why you think it isn't right, and
quite possibly include an example patch.  Then we can discuss the
issue and find the best solution.

So please feel free to post further patches, but please include more
commentary, and don't assume you understand something that you don't
really.

Thanks,
NeilBrown

> Signed-off-by: Coywolf Qi Hunt <qiyong@fc-cn.com>
> ---
>
> diff --git a/drivers/md/raid6main.c b/drivers/md/raid6main.c
> index bc69355..820536e 100644
> --- a/drivers/md/raid6main.c
> +++ b/drivers/md/raid6main.c
> @@ -468,7 +468,6 @@ static int raid6_end_write_request (stru
>  	struct stripe_head *sh = bi->bi_private;
>  	raid6_conf_t *conf = sh->raid_conf;
>  	int disks = conf->raid_disks, i;
> -	unsigned long flags;
>  	int uptodate = test_bit(BIO_UPTODATE, &bi->bi_flags);
>
>  	if (bi->bi_size)
> @@ -486,16 +485,14 @@ static int raid6_end_write_request (stru
>  		return 0;
>  	}
>
> -	spin_lock_irqsave(&conf->device_lock, flags);
>  	if (!uptodate)
>  		md_error(conf->mddev, conf->disks[i].rdev);
>
>  	rdev_dec_pending(conf->disks[i].rdev, conf->mddev);
> -
>  	clear_bit(R5_LOCKED, &sh->dev[i].flags);
>  	set_bit(STRIPE_HANDLE, &sh->state);
> -	__release_stripe(conf, sh);
> -	spin_unlock_irqrestore(&conf->device_lock, flags);
> +	release_stripe(sh);
> +
>  	return 0;
>  }
>
> --
> Coywolf Qi Hunt
* Re: [patch 1/2] raid6_end_write_request() spinlock fix
From: Coywolf Qi Hunt @ 2006-04-25  6:43 UTC
To: Neil Brown; +Cc: akpm, linux-kernel, linux-raid

On Tue, Apr 25, 2006 at 03:13:49PM +1000, Neil Brown wrote:
> On Tuesday April 25, qiyong@fc-cn.com wrote:
> > Hello,
> >
> > Reduce the raid6_end_write_request() spinlock window.
>
> Andrew: please don't include these in -mm.  This one and the
> corresponding raid5 patch are wrong, and I'm not yet sure about the
> unplug_device changes.

I am sure about the unplug_device change.  Just follow the path...

> In this case, the call to md_error, which in turn calls "error" in
> raid6main.c, requires the lock to be held as it contains:
>
> 	if (!test_bit(Faulty, &rdev->flags)) {
> 		mddev->sb_dirty = 1;
> 		if (test_bit(In_sync, &rdev->flags)) {
> 			conf->working_disks--;
> 			mddev->degraded++;
> 			conf->failed_disks++;
> 			clear_bit(In_sync, &rdev->flags);
> 			/*
> 			 * if recovery was running, make sure it aborts.
> 			 */
> 			set_bit(MD_RECOVERY_ERR, &mddev->recovery);
> 		}
> 		set_bit(Faulty, &rdev->flags);
>
> which is fairly clearly not safe without some locking.

Yes.  Let's fix error().  In any case, the current code is broken
(see raid5/6_end_read_request).

Comments?  Thanks.

Signed-off-by: Coywolf Qi Hunt <qiyong@fc-cn.com>
---

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9c24377..192de19 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -638,7 +638,7 @@ static void error(mddev_t *mddev, mdk_rd
 	raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
 	PRINTK("raid5: error called\n");

-	if (!test_bit(Faulty, &rdev->flags)) {
+	if (!test_and_set_bit(Faulty, &rdev->flags)) {
 		mddev->sb_dirty = 1;
 		if (test_bit(In_sync, &rdev->flags)) {
 			conf->working_disks--;
@@ -650,7 +650,6 @@ static void error(mddev_t *mddev, mdk_rd
 			 */
 			set_bit(MD_RECOVERY_ERR, &mddev->recovery);
 		}
-		set_bit(Faulty, &rdev->flags);
 		printk (KERN_ALERT
 			"raid5: Disk failure on %s, disabling device."
 			" Operation continuing on %d devices\n",
diff --git a/drivers/md/raid6main.c b/drivers/md/raid6main.c
index d3deedb..fc0b31d 100644
--- a/drivers/md/raid6main.c
+++ b/drivers/md/raid6main.c
@@ -527,7 +527,7 @@ static void error(mddev_t *mddev, mdk_rd
 	raid6_conf_t *conf = (raid6_conf_t *) mddev->private;
 	PRINTK("raid6: error called\n");

-	if (!test_bit(Faulty, &rdev->flags)) {
+	if (!test_and_set_bit(Faulty, &rdev->flags)) {
 		mddev->sb_dirty = 1;
 		if (test_bit(In_sync, &rdev->flags)) {
 			conf->working_disks--;
@@ -539,7 +539,6 @@ static void error(mddev_t *mddev, mdk_rd
 			 */
 			set_bit(MD_RECOVERY_ERR, &mddev->recovery);
 		}
-		set_bit(Faulty, &rdev->flags);
 		printk (KERN_ALERT
 			"raid6: Disk failure on %s, disabling device."
 			" Operation continuing on %d devices\n",

> Coywolf: As I think I have already said, I appreciate your review of
> the md/raid code and your attempts to improve it - I'm sure there is
> plenty of room to make improvements.
> However, posting patches with minimal commentary on code that you
> don't fully understand is not the best way to work with the community.
> If you see something that you think is wrong, it is much better to ask
> why it is the way it is, explain why you think it isn't right, and
> quite possibly include an example patch.  Then we can discuss the
> issue and find the best solution.
>
> So please feel free to post further patches, but please include more
> commentary, and don't assume you understand something that you don't
> really.
>
> Thanks,
> NeilBrown

--
Coywolf Qi Hunt
* Re: [patch 1/2] raid6_end_write_request() spinlock fix
From: Neil Brown @ 2006-04-25  6:50 UTC
To: Coywolf Qi Hunt; +Cc: akpm, linux-kernel, linux-raid

On Tuesday April 25, qiyong@fc-cn.com wrote:
> On Tue, Apr 25, 2006 at 03:13:49PM +1000, Neil Brown wrote:
> > Andrew: please don't include these in -mm.  This one and the
> > corresponding raid5 patch are wrong, and I'm not yet sure about the
> > unplug_device changes.
>
> I am sure about the unplug_device change.  Just follow the path...

What path?  There are probably several.  If I follow the path, will I
see the same things as you see?  Who knows, because you haven't
bothered to tell us what you see.

> Yes.  Let's fix error().  In any case, the current code is broken
> (see raid5/6_end_read_request).

What will I see in raidX_end_read_request?  Surely it isn't that hard
to write a few more sentences?

> Comments?  Thanks.

conf->working_disks isn't atomic_t, so decrementing it without a
spinlock isn't safe.  So let's just leave it all inside a spinlock.

Also I have a vague memory that clearing In_sync before setting Faulty
is important, but I'm not certain of that.

Remember: the code is there for a reason.  It might not be a good
reason, and the code could well be wrong.  But it would be worth your
effort trying to find out what the reason is before blithely changing
it (as I discovered recently with a change I suggested to
invalidate_mapping_pages).

NeilBrown
* Re: [patch 1/2] raid6_end_write_request() spinlock fix
From: Coywolf Qi Hunt @ 2006-04-25  8:07 UTC
To: Neil Brown; +Cc: akpm, linux-kernel, linux-raid

On Tue, Apr 25, 2006 at 04:50:10PM +1000, Neil Brown wrote:
> What path?  There are probably several.  If I follow the path, will I
> see the same things as you see?  Who knows, because you haven't
> bothered to tell us what you see.

There are only two places where handle_list is possibly re-filled:
__release_stripe() and raidX_activate_delayed().  So raidXd should only
wake up after these two points.

> What will I see in raidX_end_read_request?  Surely it isn't that hard
> to write a few more sentences?

You should see that md_error() in raidX_end_read_request isn't called
under any spinlock.

> conf->working_disks isn't atomic_t, so decrementing it without a
> spinlock isn't safe.  So let's just leave it all inside a spinlock.

test_and_set_bit(Faulty, &rdev->flags) protects it as well, imho.
The block can be entered only once.

> Also I have a vague memory that clearing In_sync before setting Faulty
> is important, but I'm not certain of that.

Maybe, but it seems not to apply here.

> Remember: the code is there for a reason.  It might not be a good
> reason, and the code could well be wrong.  But it would be worth your
> effort trying to find out what the reason is before blithely changing
> it (as I discovered recently with a change I suggested to
> invalidate_mapping_pages).

Thanks :)

--
Coywolf Qi Hunt