From: Vivek Goyal <vgoyal@redhat.com>
To: NeilBrown <neilb@suse.de>
Cc: Jens Axboe <jaxboe@fusionio.com>, linux-kernel@vger.kernel.org
Subject: Re: blk_throtl_exit  taking q->queue_lock is problematic
Date: Fri, 18 Feb 2011 10:05:47 -0500
Message-ID: <20110218150547.GC26654@redhat.com>
In-Reply-To: <20110218134025.2a2e5bbb@notabene.brown>

On Fri, Feb 18, 2011 at 01:40:25PM +1100, NeilBrown wrote:
> On Thu, 17 Feb 2011 11:59:06 -0500 Vivek Goyal <vgoyal@redhat.com> wrote:
> 
> > On Thu, Feb 17, 2011 at 04:55:01PM +1100, NeilBrown wrote:
> > > On Wed, 16 Feb 2011 20:10:29 -0500 Vivek Goyal <vgoyal@redhat.com> wrote:
> > > 
> > > > So is it possible to keep the spinlock intact when md is calling up
> > > > blk_cleanup_queue()?
> > > > 
> > > 
> > > It would be possible, yes - but messy.  I would probably end up just making
> > > ->queue_lock always point to __queue_lock, and then only take it at the
> > > places where I call 'block' code which wants to test to see if it is
> > > currently held (like the plugging routines).
> > > 
> > > The queue lock (and most of the request queue) is simply irrelevant for md.
> > > I would prefer to get away from having to touch it at all...
> > 
> > If queue lock is mostly irrelevant for md, then why md should provide an
> > external lock and not use queue's internal spin lock?
> 
> See other email - historical reasons mostly.
> 
> > 
> > > 
> > > I'll see how messy it would be to stop using it completely and it can just be
> > > __queue_lock.
> > > 
> > > Though for me - it would be much easier if you just used __queue_lock .....
> > 
> > Ok, here is the simple patch which splits the queue lock and uses
> > throtl_lock for throttling logic. I booted and it seems to be working.
> > 
> > Having said that, this now introduces the possibility of races for any
> > services I take from request queue. I need to see if I need to take
> > queue lock and that makes me little uncomfortable. 
> > 
> > I am using kblockd_workqueue to queue throtl work. Looks like I don't
> > need queue lock for that. I am also using block tracing infrastructure
> > and my understanding is that I don't need queue lock for that as well.
> > 
> > So if we do this change for performance reasons, it still makes sense
> > but doing this change because md provided a q->queue_lock and took away that
> > lock without notifying block layer hence we do this change, is still not
> > the right reason, IMHO.
> 
> Well...I like that patch, as it makes my life easier....
> 
> But I agree that md is doing something wrong.  Now that ->queue_lock is
> always initialised, it is wrong to leave it in a state where it is not defined.
> 
> So maybe I'll apply this (after testing it a bit).  The only reason for taking
> the lock queue_lock in a couple of places is to silence some warnings.
> 
> Thanks,
> NeilBrown

Thanks Neil. This looks much better in the sense that now we know that
queue_lock is still valid at the blk_cleanup_queue() time and did not
vanish unexpectedly.

Thanks
Vivek

> 
> 
> diff --git a/drivers/md/linear.c b/drivers/md/linear.c
> index 8a2f767..0ed7f6b 100644
> --- a/drivers/md/linear.c
> +++ b/drivers/md/linear.c
> @@ -216,7 +216,6 @@ static int linear_run (mddev_t *mddev)
>  
>  	if (md_check_no_bitmap(mddev))
>  		return -EINVAL;
> -	mddev->queue->queue_lock = &mddev->queue->__queue_lock;
>  	conf = linear_conf(mddev, mddev->raid_disks);
>  
>  	if (!conf)
> diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
> index 6d7ddf3..3a62d44 100644
> --- a/drivers/md/multipath.c
> +++ b/drivers/md/multipath.c
> @@ -435,7 +435,6 @@ static int multipath_run (mddev_t *mddev)
>  	 * bookkeeping area. [whatever we allocate in multipath_run(),
>  	 * should be freed in multipath_stop()]
>  	 */
> -	mddev->queue->queue_lock = &mddev->queue->__queue_lock;
>  
>  	conf = kzalloc(sizeof(multipath_conf_t), GFP_KERNEL);
>  	mddev->private = conf;
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index 75671df..c0ac457 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -361,7 +361,6 @@ static int raid0_run(mddev_t *mddev)
>  	if (md_check_no_bitmap(mddev))
>  		return -EINVAL;
>  	blk_queue_max_hw_sectors(mddev->queue, mddev->chunk_sectors);
> -	mddev->queue->queue_lock = &mddev->queue->__queue_lock;
>  
>  	/* if private is not null, we are here after takeover */
>  	if (mddev->private == NULL) {
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index a23ffa3..909282d 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -593,7 +593,9 @@ static int flush_pending_writes(conf_t *conf)
>  	if (conf->pending_bio_list.head) {
>  		struct bio *bio;
>  		bio = bio_list_get(&conf->pending_bio_list);
> +		spin_lock(conf->mddev->queue->queue_lock);
>  		blk_remove_plug(conf->mddev->queue);
> +		spin_unlock(conf->mddev->queue->queue_lock);
>  		spin_unlock_irq(&conf->device_lock);
>  		/* flush any pending bitmap writes to
>  		 * disk before proceeding w/ I/O */
> @@ -959,7 +961,9 @@ static int make_request(mddev_t *mddev, struct bio * bio)
>  		atomic_inc(&r1_bio->remaining);
>  		spin_lock_irqsave(&conf->device_lock, flags);
>  		bio_list_add(&conf->pending_bio_list, mbio);
> +		spin_lock(mddev->queue->queue_lock);
>  		blk_plug_device(mddev->queue);
> +		spin_unlock(mddev->queue->queue_lock);
>  		spin_unlock_irqrestore(&conf->device_lock, flags);
>  	}
>  	r1_bio_write_done(r1_bio, bio->bi_vcnt, behind_pages, behind_pages != NULL);
> @@ -2021,7 +2025,6 @@ static int run(mddev_t *mddev)
>  	if (IS_ERR(conf))
>  		return PTR_ERR(conf);
>  
> -	mddev->queue->queue_lock = &conf->device_lock;
>  	list_for_each_entry(rdev, &mddev->disks, same_set) {
>  		disk_stack_limits(mddev->gendisk, rdev->bdev,
>  				  rdev->data_offset << 9);
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 3b607b2..60e6cb1 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -662,7 +662,9 @@ static int flush_pending_writes(conf_t *conf)
>  	if (conf->pending_bio_list.head) {
>  		struct bio *bio;
>  		bio = bio_list_get(&conf->pending_bio_list);
> +		spin_lock(conf->mddev->queue->queue_lock);
>  		blk_remove_plug(conf->mddev->queue);
> +		spin_unlock(conf->mddev->queue->queue_lock);
>  		spin_unlock_irq(&conf->device_lock);
>  		/* flush any pending bitmap writes to disk
>  		 * before proceeding w/ I/O */
> @@ -970,8 +972,10 @@ static int make_request(mddev_t *mddev, struct bio * bio)
>  
>  		atomic_inc(&r10_bio->remaining);
>  		spin_lock_irqsave(&conf->device_lock, flags);
> +		spin_lock(mddev->queue->queue_lock);
>  		bio_list_add(&conf->pending_bio_list, mbio);
>  		blk_plug_device(mddev->queue);
> +		spin_unlock(mddev->queue->queue_lock);
>  		spin_unlock_irqrestore(&conf->device_lock, flags);
>  	}
>  
> @@ -2304,8 +2308,6 @@ static int run(mddev_t *mddev)
>  	if (!conf)
>  		goto out;
>  
> -	mddev->queue->queue_lock = &conf->device_lock;
> -
>  	mddev->thread = conf->thread;
>  	conf->thread = NULL;
>  
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 7028128..78536fd 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -5204,7 +5204,6 @@ static int run(mddev_t *mddev)
>  
>  		mddev->queue->backing_dev_info.congested_data = mddev;
>  		mddev->queue->backing_dev_info.congested_fn = raid5_congested;
> -		mddev->queue->queue_lock = &conf->device_lock;
>  		mddev->queue->unplug_fn = raid5_unplug_queue;
>  
>  		chunk_size = mddev->chunk_sectors << 9;
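
For readers following the thread, the lifetime issue can be modelled in
userspace. This is only a rough sketch, not kernel code: every toy_* name is
hypothetical, the "lock" is a single-threaded flag standing in for spinlock_t,
and the structs keep only the fields discussed above. It assumes the
post-patch arrangement, where queue_lock keeps pointing at the embedded
__queue_lock and the personality's device_lock is a separate lock taken
outside it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for spinlock_t: single-threaded, only tracks "held". */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

/* Simplified model of the request queue fields discussed in the thread. */
struct toy_queue {
	struct toy_lock __queue_lock;	/* embedded; lives as long as the queue */
	struct toy_lock *queue_lock;	/* the lock block-layer code takes */
	int plugged;
};

/* Simplified model of a personality's private data (e.g. raid1's conf). */
struct toy_conf {
	struct toy_lock device_lock;
};

static void toy_queue_init(struct toy_queue *q)
{
	q->__queue_lock.held = 0;
	q->queue_lock = &q->__queue_lock;	/* never repointed afterwards */
	q->plugged = 0;
}

/* Returns 1 if queue_lock still refers to storage owned by the queue. */
static int toy_queue_lock_is_internal(const struct toy_queue *q)
{
	return q->queue_lock == &q->__queue_lock;
}

/*
 * Mirrors the nesting in the raid1 make_request() hunk: device_lock
 * outside, queue_lock inside. If queue_lock aliased device_lock (the
 * pre-patch setup), the second acquire would trip the assert.
 */
static void toy_plug_pending(struct toy_queue *q, struct toy_conf *conf)
{
	toy_lock_acquire(&conf->device_lock);
	toy_lock_acquire(q->queue_lock);
	q->plugged = 1;
	toy_lock_release(q->queue_lock);
	toy_lock_release(&conf->device_lock);
}

/*
 * Models teardown code that takes queue_lock after the personality's
 * private data is gone. Safe only because queue_lock is internal.
 */
static int toy_cleanup_queue(struct toy_queue *q)
{
	toy_lock_acquire(q->queue_lock);
	q->plugged = 0;
	toy_lock_release(q->queue_lock);
	return toy_queue_lock_is_internal(q);
}
```

In this model the conf can be freed before toy_cleanup_queue() runs, because
nothing in the queue points into it any more. With the pre-patch assignment
(queue_lock = &conf->device_lock) the same sequence would dereference freed
memory.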
