From: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
To: device-mapper development <dm-devel@redhat.com>
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH] dm: noflush resizing (2/3) Use bdlookup() on noflush suspend
Date: Wed, 24 Oct 2007 17:25:48 -0400 [thread overview]
Message-ID: <471FB85C.5080707@ce.jp.nec.com> (raw)
This patch makes device-mapper to use bdlookup to allow resizing
of noflush-suspended device.
With this patch, by using bdlookup() instead of bdget(),
resizing will no longer wait for I_LOCK.
There is a race between bdlookup() and bdget().
However, it's benign because the only use of bdev obtained by
bdlookup() is to modify i_size and bdget() will anyway see the
updated disk size.
(See below)
Case 1:
CPU 0 CPU 1
------------------------------------------------------------------
set_capacity()
disk size is updated
bdlookup() -> NULL
bdget()
bdget reads updated disk size
and set it to bdev inode's i_size
Case 2:
CPU 0 CPU 1
------------------------------------------------------------------
set_capacity()
disk size is updated
bdget()
bdget reads updated disk size
and set it to bdev inode's i_size
bdlookup()
wait for the bdget completion (I_NEW)
and get the bdev inode with updated disk size
Case 3:
CPU 0 CPU 1
------------------------------------------------------------------
bdget()
bdget reads old disk size
set_capacity()
disk size is updated
bdlookup()
get the existing bdev inode
update the i_size
To change i_size of bdev inode, the new code uses bd_mutex
instead of inode->i_mutex. According to a comment in
include/linux/fs.h, mutex is necessary here only to serialize
i_size_write().
bd_inode->i_size is set via bd_set_size() and bd_set_size()
is protected by bd_mutex.
So I think bd_mutex is the lock we should use.
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
---
dm.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)
Index: linux-2.6.23.work/drivers/md/dm.c
===================================================================
--- linux-2.6.23.work.orig/drivers/md/dm.c
+++ linux-2.6.23.work/drivers/md/dm.c
@@ -1101,11 +1101,29 @@ static void event_callback(void *context
static void __set_size(struct mapped_device *md, sector_t size)
{
+ struct block_device *bdev;
+
set_capacity(md->disk, size);
- mutex_lock(&md->suspended_bdev->bd_inode->i_mutex);
- i_size_write(md->suspended_bdev->bd_inode, (loff_t)size << SECTOR_SHIFT);
- mutex_unlock(&md->suspended_bdev->bd_inode->i_mutex);
+ /*
+ * Updated disk size should be visible to bdget() possibly
+ * called in parallel to bdlookup()
+ */
+ smp_mb();
+
+ if (md->suspended_bdev)
+ bdev = md->suspended_bdev;
+ else
+ bdev = bdlookup_disk(md->disk, 0);
+
+ if (bdev) {
+ mutex_lock(&bdev->bd_mutex);
+ i_size_write(bdev->bd_inode, (loff_t)size << SECTOR_SHIFT);
+ mutex_unlock(&bdev->bd_mutex);
+ }
+
+ if (!md->suspended_bdev)
+ bdput(bdev);
}
static int __bind(struct mapped_device *md, struct dm_table *t)
@@ -1121,8 +1139,7 @@ static int __bind(struct mapped_device *
if (size != get_capacity(md->disk))
memset(&md->geometry, 0, sizeof(md->geometry));
- if (md->suspended_bdev)
- __set_size(md, size);
+ __set_size(md, size);
if (size == 0)
return 0;
reply other threads:[~2007-10-24 21:25 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=471FB85C.5080707@ce.jp.nec.com \
--to=j-nomura@ce.jp.nec.com \
--cc=dm-devel@redhat.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.