public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* md related oops triggered in bdev_inode_switch_bdi
@ 2011-08-31  6:22 NeilBrown
  2011-09-01  3:30 ` Wu Fengguang
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2011-08-31  6:22 UTC (permalink / raw)
  To: Christoph Hellwig, Hugh Dickins, Andrew Morton, Wu Fengguang; +Cc: lkml


Hi Christoph et. al.,

 My testing recently triggered an oops in bdi_lock_two called from
 bdev_inode_switch_bdi.
 The bdi and the request_queue that contains it had been freed.

 This happens with md which can free the md device and request queue
 immediately after last close.

 It seems that this is caused by your patch f758eeabeb96f8.
 Prior to that the 'old' bdi was never dereferenced in
 bdev_inode_switch_bdi.  Now it is.

 I think we can fix that by simply moving the call to bdev_inode_switch_bdi
 before the call to ->release as in the patch below.

 Do you see any problem with this patch?

Thanks,
NeilBrown


>From bcc5851cbb6876c97cce214feddd0ec092f7d71c Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Wed, 31 Aug 2011 15:37:22 +1000
Subject: [PATCH] Avoid dereferencing a 'request_queue' after last close.

On the last close of an 'md' device which as been stopped, the device
is destroyed and in particular the request_queue is freed.  The free
is done in a separate thread so it might happen a short time later.

__blkdev_put calls bdev_inode_switch_bdi *after* ->release has been
called.

Since commit f758eeabeb96f878c860e8f110f94ec8820822a9
bdev_inode_switch_bdi will dereference the 'old' bdi, which lives
inside a request_queue, to get a spin lock.  This causes the last
close on an md device to sometime take a spin_lock which lives in
freed memory - which results in an oops.

So move the called to bdev_inode_switch_bdi before the call to
->release.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/fs/block_dev.c b/fs/block_dev.c
index ff77262..d8a753f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1430,6 +1430,12 @@ static int __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 		sync_blockdev(bdev);
 		kill_bdev(bdev);
 	}
+	if (!bdev->bd_openers)
+		/* ->release can cause the old bdi to disappear,
+		 * so must switch it out first
+		 */
+		bdev_inode_switch_bdi(bdev->bd_inode,
+					&default_backing_dev_info);
 	if (bdev->bd_contains == bdev) {
 		if (disk->fops->release)
 			ret = disk->fops->release(disk, mode);
@@ -1442,8 +1448,6 @@ static int __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
 		disk_put_part(bdev->bd_part);
 		bdev->bd_part = NULL;
 		bdev->bd_disk = NULL;
-		bdev_inode_switch_bdi(bdev->bd_inode,
-					&default_backing_dev_info);
 		if (bdev != bdev->bd_contains)
 			victim = bdev->bd_contains;
 		bdev->bd_contains = NULL;

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-09-11  7:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-08-31  6:22 md related oops triggered in bdev_inode_switch_bdi NeilBrown
2011-09-01  3:30 ` Wu Fengguang
2011-09-01  5:49   ` NeilBrown
2011-09-09  8:56     ` Lin Ming
2011-09-11  7:01       ` Sitsofe Wheeler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox