From: Mike Snitzer <snitzer@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>,
linux-nfs@vger.kernel.org, linux-mm@kvack.org,
Jens Axboe <axboe@fb.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
device-mapper development <dm-devel@redhat.com>,
linux-mtd@lists.infradead.org, Tejun Heo <tj@kernel.org>,
ceph-devel@vger.kernel.org, Jeff Moyer <jmoyer@redhat.com>
Subject: Re: [PATCH 11/12] fs: don't reassign dirty inodes to default_backing_dev_info
Date: Mon, 23 Mar 2015 18:40:13 -0400
Message-ID: <20150323224012.GA29505@redhat.com>
In-Reply-To: <CAMM=eLe6Tt+g7dLcnn5a1fQboDknkasazsMiOFBziWPZemnYtg@mail.gmail.com>
On Sat, Mar 21 2015 at 11:11am -0400,
Mike Snitzer <snitzer@redhat.com> wrote:
> On Wed, Jan 14, 2015 at 4:42 AM, Christoph Hellwig <hch@lst.de> wrote:
> > If we have dirty inodes we need to call the filesystem for it, even if the
> > device has been removed and the filesystem will error out early. The
> > current code does that by reassigning all dirty inodes to the default
> > backing_dev_info when a bdi is unlinked, but that's pretty pointless given
> > that the bdi must always outlive the super block.
> >
> > Instead of stopping writeback at unregister time and moving inodes to the
> > default bdi just keep the current bdi alive until it is destroyed. The
> > containing objects of the bdi ensure this doesn't happen until all
> > writeback has finished by erroring out.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > Reviewed-by: Tejun Heo <tj@kernel.org>
> > ---
> > mm/backing-dev.c | 91 +++++++++++++++-----------------------------------------
> > 1 file changed, 24 insertions(+), 67 deletions(-)
>
> Hey Christoph,
>
> Just a heads up: your commit c4db59d31e39ea067c32163ac961e9c80198fd37
> is suspected as the first bad commit in a bisect performed to track
> down the cause of DM crashes reported in this BZ:
> https://bugzilla.redhat.com/show_bug.cgi?id=1202449
>
> I've yet to look closely at _why_ this commit triggers the crashes, but
> figured I'd share since this appears to be a 4.0-rcX regression.
FYI, here is the DM fix I've staged for 4.0-rc6. I'll continue testing
the various DM targets before requesting Linus to pull.
From 63a4f065ece613b6d575b538234375b0e9c23bbc Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer@redhat.com>
Date: Mon, 23 Mar 2015 17:01:43 -0400
Subject: [PATCH] dm: fix add_disk() NULL pointer due to race with free_dev()
Commit c4db59d31e39 ("fs: don't reassign dirty inodes to
default_backing_dev_info") exposed DM to a latent race in free_dev() vs
add_disk() in relation to management of the device's minor number.

Fix this by refactoring free_dev() to match cleanup order of the
alloc_dev() error path. Move cleanup of the gendisk, queue, and bdev
to _before_ the cleanup of the idr managed minor number.

Also, purely due to cleanup that fell out during the free_dev() audit:
- adjust dm_blk_close() to access the gendisk's private_data under
  the _minor_lock spinlock.
- move __dm_destroy()'s dm_get_live_table() call out from under the
  _minor_lock spinlock.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1202449
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
drivers/md/dm.c | 26 ++++++++++++++++----------
1 files changed, 16 insertions(+), 10 deletions(-)
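
For readers following the dm_blk_close() change in the diff below, here is a
minimal userspace sketch of the same idea: read the back-pointer only while
holding the lock that the teardown path takes before clearing it, and
tolerate it being NULL. It is not the kernel code. struct device, struct
disk, disk_close(), disk_detach() and the pthread mutex are hypothetical
stand-ins for mapped_device, gendisk, dm_blk_close(), the private_data
clearing in free_dev() and the _minor_lock spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct mapped_device and struct gendisk. */
struct device { int open_count; };
struct disk { struct device *private_data; };

/* Stand-in for the _minor_lock spinlock. */
static pthread_mutex_t minor_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Close path: private_data is sampled only while the lock is held, so it
 * cannot be cleared between the read and the use.
 * Returns 0 on a normal close, -1 if the disk was already detached.
 */
int disk_close(struct disk *disk)
{
    struct device *dev;
    int ret = -1;

    assert(disk != NULL);

    pthread_mutex_lock(&minor_lock);
    dev = disk->private_data;
    if (dev != NULL) {
        dev->open_count--;
        ret = 0;
    }
    pthread_mutex_unlock(&minor_lock);
    return ret;
}

/* Teardown path: clears the back-pointer under the same lock. */
void disk_detach(struct disk *disk)
{
    assert(disk != NULL);

    pthread_mutex_lock(&minor_lock);
    disk->private_data = NULL;
    pthread_mutex_unlock(&minor_lock);
}
```

Calling disk_close() on a disk whose private_data is still set decrements
the count and returns 0; calling it again after disk_detach() returns -1 and
leaves the device untouched.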
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 9b641b3..8001fe9 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -433,7 +433,6 @@ static int dm_blk_open(struct block_device *bdev, fmode_t mode)
dm_get(md);
atomic_inc(&md->open_count);
-
out:
spin_unlock(&_minor_lock);
@@ -442,16 +441,20 @@ out:
static void dm_blk_close(struct gendisk *disk, fmode_t mode)
{
- struct mapped_device *md = disk->private_data;
+ struct mapped_device *md;
spin_lock(&_minor_lock);
+ md = disk->private_data;
+ if (WARN_ON(!md))
+ goto out;
+
if (atomic_dec_and_test(&md->open_count) &&
(test_bit(DMF_DEFERRED_REMOVE, &md->flags)))
queue_work(deferred_remove_workqueue, &deferred_remove_work);
dm_put(md);
-
+out:
spin_unlock(&_minor_lock);
}
@@ -2241,7 +2244,6 @@ static void free_dev(struct mapped_device *md)
int minor = MINOR(disk_devt(md->disk));
unlock_fs(md);
- bdput(md->bdev);
destroy_workqueue(md->wq);
if (md->kworker_task)
@@ -2252,19 +2254,22 @@ static void free_dev(struct mapped_device *md)
mempool_destroy(md->rq_pool);
if (md->bs)
bioset_free(md->bs);
- blk_integrity_unregister(md->disk);
- del_gendisk(md->disk);
+
cleanup_srcu_struct(&md->io_barrier);
free_table_devices(&md->table_devices);
- free_minor(minor);
+ dm_stats_cleanup(&md->stats);
spin_lock(&_minor_lock);
md->disk->private_data = NULL;
spin_unlock(&_minor_lock);
-
+ if (blk_get_integrity(md->disk))
+ blk_integrity_unregister(md->disk);
+ del_gendisk(md->disk);
put_disk(md->disk);
blk_cleanup_queue(md->queue);
- dm_stats_cleanup(&md->stats);
+ bdput(md->bdev);
+ free_minor(minor);
+
module_put(THIS_MODULE);
kfree(md);
}
@@ -2642,8 +2647,9 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
might_sleep();
- spin_lock(&_minor_lock);
map = dm_get_live_table(md, &srcu_idx);
+
+ spin_lock(&_minor_lock);
idr_replace(&_minor_idr, MINOR_ALLOCED, MINOR(disk_devt(dm_disk(md))));
set_bit(DMF_FREEING, &md->flags);
spin_unlock(&_minor_lock);
--
1.7.4.4
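
To illustrate why the free_dev() ordering matters, here is a small,
single-threaded userspace sketch. It is a toy model under stated
assumptions, not the DM implementation: minor_reserved[] loosely stands in
for the idr of minors, published[] for whatever is registered under a minor
by add_disk(), and every function name is hypothetical. The "concurrent"
creator is simulated by calling alloc_minor() in the middle of a teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MINORS 4

/* Toy model: reserved minors and objects published under a minor. */
static bool minor_reserved[MAX_MINORS];
static const void *published[MAX_MINORS];

/* Reserve the lowest free minor, or return -1 if none is left. */
int alloc_minor(void)
{
    for (int i = 0; i < MAX_MINORS; i++) {
        if (!minor_reserved[i]) {
            minor_reserved[i] = true;
            return i;
        }
    }
    return -1;
}

void release_minor(int minor)
{
    assert(minor >= 0 && minor < MAX_MINORS);
    minor_reserved[minor] = false;
}

/* Publish an object under a minor; fails if the slot is still occupied. */
bool publish(int minor, const void *obj)
{
    assert(minor >= 0 && minor < MAX_MINORS);
    if (published[minor] != NULL)
        return false;
    published[minor] = obj;
    return true;
}

void unpublish(int minor)
{
    assert(minor >= 0 && minor < MAX_MINORS);
    published[minor] = NULL;
}

/*
 * Wrong order: the minor is released while the old object is still
 * published. A creator arriving in that window reuses the minor and its
 * publish step fails. Returns true if that collision happened.
 */
bool teardown_minor_first_collides(void)
{
    int old_dev = 0, new_dev = 0;
    int minor = alloc_minor();
    publish(minor, &old_dev);

    release_minor(minor);                  /* too early */
    int reused = alloc_minor();            /* creator gets the same minor */
    bool ok = publish(reused, &new_dev);   /* slot still occupied */

    unpublish(minor);                      /* reset the toy state */
    release_minor(reused);
    return reused == minor && !ok;
}

/*
 * Order used by the patch: everything published under the minor goes away
 * first and the minor is released last. A creator arriving mid-teardown
 * is handed a different minor. Returns true if a collision happened.
 */
bool teardown_minor_last_collides(void)
{
    int old_dev = 0, new_dev = 0;
    int minor = alloc_minor();
    publish(minor, &old_dev);

    int other = alloc_minor();             /* creator arrives mid-teardown */
    unpublish(minor);
    release_minor(minor);                  /* released last */
    bool ok = publish(other, &new_dev);

    unpublish(other);                      /* reset the toy state */
    release_minor(other);
    return other == minor || !ok;
}
```

In this model the first function reports a collision and the second does
not, which mirrors the reasoning in the changelog: the minor should stay
reserved until the gendisk, queue, and bdev tied to it have been cleaned up.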