linux-raid.vger.kernel.org archive mirror
* [PULL REQUEST] md bug fixes and minor improvements
@ 2008-08-01  3:02 Neil Brown
  2008-08-01 17:16 ` Linus Torvalds
  0 siblings, 1 reply; 9+ messages in thread
From: Neil Brown @ 2008-08-01  3:02 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Arthur Jones, Dan Williams, linux-kernel, linux-raid


Hi Linus,
 please pull the following bugfixes for drivers/md.

Thanks,
NeilBrown



The following changes since commit 6e86841d05f371b5b9b86ce76c02aaee83352298:
  Linus Torvalds (1):
        Linux 2.6.27-rc1

are available in the git repository at:

  git://neil.brown.name/md/ for-linus

Arthur Jones (1):
      md: raid10: wake up frozen array

Dan Williams (5):
      md: move async_tx_issue_pending_all outside spin_lock_irq
      md: fix merge error
      md: delay notification of 'active_idle' to the recovery thread
      md: do not progress the resync process if the stripe was blocked
      md: do not count blocked devices as spares

 drivers/md/md.c           |    8 ++++++--
 drivers/md/raid10.c       |    3 +++
 drivers/md/raid5.c        |   29 ++++++++++++++++++-----------
 include/linux/raid/md_k.h |    1 +
 4 files changed, 28 insertions(+), 13 deletions(-)


* Re: [PULL REQUEST] md bug fixes and minor improvements
  2008-08-01  3:02 [PULL REQUEST] md bug fixes and minor improvements Neil Brown
@ 2008-08-01 17:16 ` Linus Torvalds
  2008-08-01 17:22   ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2008-08-01 17:16 UTC (permalink / raw)
  To: Neil Brown
  Cc: Arthur Jones, Dan Williams, Linux Kernel Mailing List, linux-raid,
	Rafael J. Wysocki, Jens Axboe



On Fri, 1 Aug 2008, Neil Brown wrote:
> 
> Hi Linus,
>  please pull the following bugfixes for drivers/md.

Hmm. This doesn't seem to include any fix for the reported unlocked 
blk_plug() from MD?

See the emails from Rafael on the kernel mailing list for details 
(WARNING: at /home/rafael/src/linux-next/include/linux/blkdev.h:447), but 
it boils down to

	WARNING: at /home/rafael/src/linux-2.6/include/linux/blkdev.h:447 blk_plug_device+0x9b/0xb0()
	Pid: 2268, comm: kjournald Not tainted 2.6.27-rc1-git #211

	Call Trace:
	 [<ffffffff8023af5f>] warn_on_slowpath+0x5f/0x80
	 [<ffffffff8034fc7b>] blk_plug_device+0x9b/0xb0
	 [<ffffffff8044d5bf>] bitmap_startwrite+0xbf/0x1b0

where it really looks like "bitmap_startwrite()" just calls 
blk_plug_device() without holding the queue lock. The rule for that 
function is documented to be:

 * This is called with interrupts off and no requests on the queue and
 * with the queue lock held.

Hmm?

Now, admittedly, the blk interfaces here are a bit inconsistent: I think 
blk_unplug() is supposed to be called _without_ the lock, so it's a bit 
odd that blk_plug_device() is supposed to be called with it held, but 
somebody should double-check me on that one.

I guess Jens is gone too..

		Linus


* Re: [PULL REQUEST] md bug fixes and minor improvements
  2008-08-01 17:16 ` Linus Torvalds
@ 2008-08-01 17:22   ` Jens Axboe
  2008-08-01 17:34     ` Dan Williams
  2008-08-01 18:18     ` Linus Torvalds
  0 siblings, 2 replies; 9+ messages in thread
From: Jens Axboe @ 2008-08-01 17:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Neil Brown, Arthur Jones, Dan Williams, Linux Kernel Mailing List,
	linux-raid, Rafael J. Wysocki

On Fri, Aug 01 2008, Linus Torvalds wrote:
> 
> 
> On Fri, 1 Aug 2008, Neil Brown wrote:
> > 
> > Hi Linus,
> >  please pull the following bugfixes for drivers/md.
> 
> Hmm. This doesn't seem to include any fix for the reported unlocked 
> blk_plug() from MD?
> 
> See the emails from Rafael on the kernel mailing list for details 
> (WARNING: at /home/rafael/src/linux-next/include/linux/blkdev.h:447), but 
> it boils down to
> 
> 	WARNING: at /home/rafael/src/linux-2.6/include/linux/blkdev.h:447 blk_plug_device+0x9b/0xb0()
> 	Pid: 2268, comm: kjournald Not tainted 2.6.27-rc1-git #211
> 
> 	Call Trace:
> 	 [<ffffffff8023af5f>] warn_on_slowpath+0x5f/0x80
> 	 [<ffffffff8034fc7b>] blk_plug_device+0x9b/0xb0
> 	 [<ffffffff8044d5bf>] bitmap_startwrite+0xbf/0x1b0
> 
> where it really looks like "bitmap_startwrite()" just calls 
> blk_plug_device() without holding the queue lock. The rule for that 
> function is documented to be:
> 
>  * This is called with interrupts off and no requests on the queue and
>  * with the queue lock held.
> 
> Hmm?
> 
> Now, admittedly, the blk interfaces here are a bit inconsistent: I think 
> blk_unplug() is supposed to be called _without_ the lock, so it's a bit 
> odd that blk_plug_device() is supposed to be called with it held, but 
> somebody should double-check me on that one.

It is a bit asymmetrical, largely due to the fact that the ->unplug_fn()
itself grabs the lock. The below patch should fix it, since Neil has
added a proper queue lock to the md queues. If someone can confirm that
this fixes it, I'll queue up a patch with proper descriptions.

> I guess Jens is gone too..

I'm back, just been busy this week :-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 621a272..f19b52f 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1234,7 +1234,9 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
 		case 0:
 			bitmap_file_set_bit(bitmap, offset);
 			bitmap_count_page(bitmap,offset, 1);
+			spin_lock_irq(&bitmap->mddev->queue->queue_lock);
 			blk_plug_device(bitmap->mddev->queue);
+			spin_unlock_irq(&bitmap->mddev->queue->queue_lock);
 			/* fall through */
 		case 1:
 			*bmc = 2;

-- 
Jens Axboe


* Re: [PULL REQUEST] md bug fixes and minor improvements
  2008-08-01 17:22   ` Jens Axboe
@ 2008-08-01 17:34     ` Dan Williams
  2008-08-01 17:40       ` Jens Axboe
  2008-08-01 18:18     ` Linus Torvalds
  1 sibling, 1 reply; 9+ messages in thread
From: Dan Williams @ 2008-08-01 17:34 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Linus Torvalds, Neil Brown, Arthur Jones,
	Linux Kernel Mailing List, linux-raid, Rafael J. Wysocki


On Fri, 2008-08-01 at 19:22 +0200, Jens Axboe wrote:
> It is a bit asymmetrical, largely due to the fact that the ->unplug_fn()
> itself grabs the lock. The below patch should fix it, since Neil has
> added a proper queue lock to the md queues. If someone can confirm that
> this fixes it, I'll queue up a patch with proper descriptions.
> 
> > I guess Jens is gone too..
> 
> I'm back, just been busy this week :-)
> 
> diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> index 621a272..f19b52f 100644
> --- a/drivers/md/bitmap.c
> +++ b/drivers/md/bitmap.c
> @@ -1234,7 +1234,9 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
>  		case 0:
>  			bitmap_file_set_bit(bitmap, offset);
>  			bitmap_count_page(bitmap,offset, 1);
> +			spin_lock_irq(&bitmap->mddev->queue->queue_lock);
>  			blk_plug_device(bitmap->mddev->queue);
> +			spin_unlock_irq(&bitmap->mddev->queue->queue_lock);
>  			/* fall through */
>  		case 1:
>  			*bmc = 2;
> 

We also need to protect the blk_plug_device call a few lines down (and
an obvious compile fix).

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index 621a272..c1b07e7 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -1224,7 +1224,9 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
 			prepare_to_wait(&bitmap->overflow_wait, &__wait,
 					TASK_UNINTERRUPTIBLE);
 			spin_unlock_irq(&bitmap->lock);
+			spin_lock_irq(bitmap->mddev->queue->queue_lock);
 			blk_unplug(bitmap->mddev->queue);
+			spin_unlock_irq(bitmap->mddev->queue->queue_lock);
 			schedule();
 			finish_wait(&bitmap->overflow_wait, &__wait);
 			continue;
@@ -1234,7 +1236,9 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
 		case 0:
 			bitmap_file_set_bit(bitmap, offset);
 			bitmap_count_page(bitmap,offset, 1);
+			spin_lock_irq(bitmap->mddev->queue->queue_lock);
 			blk_plug_device(bitmap->mddev->queue);
+			spin_unlock_irq(bitmap->mddev->queue->queue_lock);
 			/* fall through */
 		case 1:
 			*bmc = 2;





* Re: [PULL REQUEST] md bug fixes and minor improvements
  2008-08-01 17:34     ` Dan Williams
@ 2008-08-01 17:40       ` Jens Axboe
  2008-08-01 18:22         ` Dan Williams
  0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2008-08-01 17:40 UTC (permalink / raw)
  To: Dan Williams
  Cc: Linus Torvalds, Neil Brown, Arthur Jones,
	Linux Kernel Mailing List, linux-raid, Rafael J. Wysocki

On Fri, Aug 01 2008, Dan Williams wrote:
> 
> On Fri, 2008-08-01 at 19:22 +0200, Jens Axboe wrote:
> > It is a bit asymmetrical, largely due to the fact that the ->unplug_fn()
> > itself grabs the lock. The below patch should fix it, since Neil has
> > added a proper queue lock to the md queues. If someone can confirm that
> > this fixes it, I'll queue up a patch with proper descriptions.
> > 
> > > I guess Jens is gone too..
> > 
> > I'm back, just been busy this week :-)
> > 
> > diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> > index 621a272..f19b52f 100644
> > --- a/drivers/md/bitmap.c
> > +++ b/drivers/md/bitmap.c
> > @@ -1234,7 +1234,9 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
> >  		case 0:
> >  			bitmap_file_set_bit(bitmap, offset);
> >  			bitmap_count_page(bitmap,offset, 1);
> > +			spin_lock_irq(&bitmap->mddev->queue->queue_lock);
> >  			blk_plug_device(bitmap->mddev->queue);
> > +			spin_unlock_irq(&bitmap->mddev->queue->queue_lock);
> >  			/* fall through */
> >  		case 1:
> >  			*bmc = 2;
> > 
> 
> We also need to protect the blk_plug_device call a few lines down (and
> an obvious compile fix).

Old source I guess, just one blk_plug_device() in the copy I have here.
Just checked latest git, still just one blk_plug_device(), are you
diffing against -mm or something like that? Or linux-next?

And queue_lock is of course a pointer, I didn't even compile the
thing... Thanks for the updated variant!

-- 
Jens Axboe



* Re: [PULL REQUEST] md bug fixes and minor improvements
  2008-08-01 17:22   ` Jens Axboe
  2008-08-01 17:34     ` Dan Williams
@ 2008-08-01 18:18     ` Linus Torvalds
  2008-08-01 18:22       ` Jens Axboe
  1 sibling, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2008-08-01 18:18 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Neil Brown, Arthur Jones, Dan Williams, Linux Kernel Mailing List,
	linux-raid, Rafael J. Wysocki



On Fri, 1 Aug 2008, Jens Axboe wrote:
> +			spin_lock_irq(&bitmap->mddev->queue->queue_lock);
>  			blk_plug_device(bitmap->mddev->queue);
> +			spin_unlock_irq(&bitmap->mddev->queue->queue_lock);

Can we please not have a chain of three dereferences in a row like that? 
That's an almost certain sign that we should either have a helper function 
or just a variable, and do it as

	queue = bitmap->mddev->queue;

	spin_lock_irq(&queue->queue_lock);
	blk_plug_device(queue);
	spin_unlock_irq(&queue->queue_lock);

Hmm? Perhaps the helper function is cleaner, ie

	static inline void blk_plug_device_unlocked(struct request_queue * queue)
	{..

instead. That, of course, would have to use spin_lock_irqsave().
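
A userspace sketch of the proposed helper may make the irqsave remark concrete. Everything prefixed `mock_` is a hypothetical stand-in, not kernel API: the model only tracks whether the lock is held and whether interrupts count as disabled. It assumes, as noted elsewhere in the thread, that `queue_lock` is a pointer. Since such a helper could be reached with interrupts already off, it restores the saved state instead of unconditionally re-enabling interrupts, which is what a `spin_lock_irq()`/`spin_unlock_irq()` pair would do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
static bool mock_irqs_disabled;		/* models the CPU's interrupt state */

struct mock_spinlock {
	bool held;
};

struct mock_queue {
	struct mock_spinlock *queue_lock;	/* a pointer, so no '&' at call sites */
	bool plugged;
	unsigned int warnings;			/* counts contract violations */
};

/* Models spin_lock_irqsave(): remember the old interrupt state, then disable. */
static void mock_lock_irqsave(struct mock_spinlock *lock, bool *flags)
{
	*flags = mock_irqs_disabled;
	mock_irqs_disabled = true;
	assert(!lock->held);	/* a real spinlock would spin forever here */
	lock->held = true;
}

/* Models spin_unlock_irqrestore(): put the interrupt state back as it was. */
static void mock_unlock_irqrestore(struct mock_spinlock *lock, bool flags)
{
	lock->held = false;
	mock_irqs_disabled = flags;
}

/* Models blk_plug_device(): the caller must hold the lock with interrupts off. */
static void mock_plug_device(struct mock_queue *q)
{
	if (!q->queue_lock->held || !mock_irqs_disabled)
		q->warnings++;	/* stands in for the WARNING in the trace */
	q->plugged = true;
}

/* Sketch of the proposed blk_plug_device_unlocked() helper. */
static void mock_plug_device_unlocked(struct mock_queue *q)
{
	bool flags;

	mock_lock_irqsave(q->queue_lock, &flags);
	mock_plug_device(q);
	mock_unlock_irqrestore(q->queue_lock, flags);
}
```

In this model, calling `mock_plug_device()` directly on an unlocked queue bumps `warnings`, while going through `mock_plug_device_unlocked()` does not and leaves `mock_irqs_disabled` exactly as it found it.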

			Linus


* Re: [PULL REQUEST] md bug fixes and minor improvements
  2008-08-01 17:40       ` Jens Axboe
@ 2008-08-01 18:22         ` Dan Williams
  2008-08-01 18:29           ` Jens Axboe
  0 siblings, 1 reply; 9+ messages in thread
From: Dan Williams @ 2008-08-01 18:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Linus Torvalds, Neil Brown, Arthur Jones,
	Linux Kernel Mailing List, linux-raid, Rafael J. Wysocki


On Fri, 2008-08-01 at 10:40 -0700, Jens Axboe wrote:
> On Fri, Aug 01 2008, Dan Williams wrote:
> >
> > On Fri, 2008-08-01 at 19:22 +0200, Jens Axboe wrote:
> > > It is a bit asymmetrical, largely due to the fact that the ->unplug_fn()
> > > itself grabs the lock. The below patch should fix it, since Neil has
> > > added a proper queue lock to the md queues. If someone can confirm that
> > > this fixes it, I'll queue up a patch with proper descriptions.
> > >
> > > > I guess Jens is gone too..
> > >
> > > I'm back, just been busy this week :-)
> > >
> > > diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> > > index 621a272..f19b52f 100644
> > > --- a/drivers/md/bitmap.c
> > > +++ b/drivers/md/bitmap.c
> > > @@ -1234,7 +1234,9 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
> > >             case 0:
> > >                     bitmap_file_set_bit(bitmap, offset);
> > >                     bitmap_count_page(bitmap,offset, 1);
> > > +                   spin_lock_irq(&bitmap->mddev->queue->queue_lock);
> > >                     blk_plug_device(bitmap->mddev->queue);
> > > +                   spin_unlock_irq(&bitmap->mddev->queue->queue_lock);
> > >                     /* fall through */
> > >             case 1:
> > >                     *bmc = 2;
> > >
> >
> > We also need to protect the blk_plug_device call a few lines down (and
> > an obvious compile fix).
> 
> Old source I guess, just one blk_plug_device() in the copy I have here.
> Just checked latest git, still just one blk_plug_device(), are you
> diffing against -mm or something like that? Or linux-next?

No, my mistake... I crossed my eyes and misread your patch as protecting
blk_unplug() a few lines up, sorry.
> 
> And queue_lock is of course a pointer, I didn't even compile the
> thing... Thanks for the updated variant!

I have verified that:

	mdadm --create /dev/md0 /dev/sd[bc] -n 2 -l 1 --bitmap=internal
	dd if=/dev/zero of=/dev/md0 bs=1024k count=1

...no longer triggers the warning with your fix.

Regards,
Dan



* Re: [PULL REQUEST] md bug fixes and minor improvements
  2008-08-01 18:18     ` Linus Torvalds
@ 2008-08-01 18:22       ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2008-08-01 18:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Neil Brown, Arthur Jones, Dan Williams, Linux Kernel Mailing List,
	linux-raid, Rafael J. Wysocki

On Fri, Aug 01 2008, Linus Torvalds wrote:
> 
> 
> On Fri, 1 Aug 2008, Jens Axboe wrote:
> > +			spin_lock_irq(&bitmap->mddev->queue->queue_lock);
> >  			blk_plug_device(bitmap->mddev->queue);
> > +			spin_unlock_irq(&bitmap->mddev->queue->queue_lock);
> 
> Can we please not have a chain of three dereferences in a row like that? 
> That's an almost certain sign that we should either have a helper function 
> or just a variable, and do it as
> 
> 	queue = bitmap->mddev->queue;
> 
> 	spin_lock_irq(&queue->queue_lock);
> 	blk_plug_device(queue);
> 	spin_unlock_irq(&queue->queue_lock);
> 
> Hmm? Perhaps the helper function is cleaner, ie
> 
> 	static inline void blk_plug_device_unlocked(struct request_queue * queue)
> 	{..
> 
> instead. That, of course, would have to use spin_lock_irqsave().

I rather like that. I've got a few simpler things to push, I'll queue it
up with that and send you a pull request later today.

-- 
Jens Axboe



* Re: [PULL REQUEST] md bug fixes and minor improvements
  2008-08-01 18:22         ` Dan Williams
@ 2008-08-01 18:29           ` Jens Axboe
  0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2008-08-01 18:29 UTC (permalink / raw)
  To: Dan Williams
  Cc: Linus Torvalds, Neil Brown, Arthur Jones,
	Linux Kernel Mailing List, linux-raid, Rafael J. Wysocki

On Fri, Aug 01 2008, Dan Williams wrote:
> 
> On Fri, 2008-08-01 at 10:40 -0700, Jens Axboe wrote:
> > On Fri, Aug 01 2008, Dan Williams wrote:
> > >
> > > On Fri, 2008-08-01 at 19:22 +0200, Jens Axboe wrote:
> > > > It is a bit asymmetrical, largely due to the fact that the ->unplug_fn()
> > > > itself grabs the lock. The below patch should fix it, since Neil has
> > > > added a proper queue lock to the md queues. If someone can confirm that
> > > > this fixes it, I'll queue up a patch with proper descriptions.
> > > >
> > > > > I guess Jens is gone too..
> > > >
> > > > I'm back, just been busy this week :-)
> > > >
> > > > diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
> > > > index 621a272..f19b52f 100644
> > > > --- a/drivers/md/bitmap.c
> > > > +++ b/drivers/md/bitmap.c
> > > > @@ -1234,7 +1234,9 @@ int bitmap_startwrite(struct bitmap *bitmap, sector_t offset, unsigned long sect
> > > >             case 0:
> > > >                     bitmap_file_set_bit(bitmap, offset);
> > > >                     bitmap_count_page(bitmap,offset, 1);
> > > > +                   spin_lock_irq(&bitmap->mddev->queue->queue_lock);
> > > >                     blk_plug_device(bitmap->mddev->queue);
> > > > +                   spin_unlock_irq(&bitmap->mddev->queue->queue_lock);
> > > >                     /* fall through */
> > > >             case 1:
> > > >                     *bmc = 2;
> > > >
> > >
> > > We also need to protect the blk_plug_device call a few lines down (and
> > > an obvious compile fix).
> > 
> > Old source I guess, just one blk_plug_device() in the copy I have here.
> > Just checked latest git, still just one blk_plug_device(), are you
> > diffing against -mm or something like that? Or linux-next?
> 
> No, my mistake... I crossed my eyes and misread your patch as protecting
> blk_unplug() a few lines up, sorry.

Ah, didn't read that closely in your patch, that would get you into
trouble :-)
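
The trouble follows from the earlier point in this thread that the `->unplug_fn()` takes the queue lock itself: a caller that already holds that non-recursive lock would deadlock. A minimal userspace sketch of the asymmetry, using hypothetical `toy_` names that are not kernel API and a try-lock that reports the would-be deadlock instead of hanging:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a non-recursive lock; not kernel API. */
struct toy_lock {
	bool held;
};

struct toy_queue {
	struct toy_lock *queue_lock;
	bool plugged;
};

/* Returns false where a real spinlock would spin forever (self-deadlock). */
static bool toy_try_lock(struct toy_lock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *lock)
{
	lock->held = false;
}

/* Models the plug side: relies on the caller holding the lock. */
static bool toy_plug(struct toy_queue *q)
{
	if (!q->queue_lock->held)
		return false;	/* contract violated, as in the reported warning */
	q->plugged = true;
	return true;
}

/* Models the unplug side: takes the lock itself, so the caller must not hold it. */
static bool toy_unplug(struct toy_queue *q)
{
	if (!toy_try_lock(q->queue_lock))
		return false;	/* caller already holds the lock: would deadlock */
	q->plugged = false;
	toy_unlock(q->queue_lock);
	return true;
}
```

Under these assumptions, `toy_plug()` succeeds only with the lock held and `toy_unplug()` succeeds only without it, which is the inconsistency Linus pointed out.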

> > And queue_lock is of course a pointer, I didn't even compile the
> > thing... Thanks for the updated variant!
> 
> I have verified that:
> 
> 	mdadm --create /dev/md0 /dev/sd[bc] -n 2 -l 1 --bitmap=internal
> 	dd if=/dev/zero of=/dev/md0 bs=1024k count=1
> 
> ...no longer triggers the warning with your fix.

Goodie, thanks!

-- 
Jens Axboe


