All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
To: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>
Cc: device-mapper development <dm-devel@redhat.com>,
	Alasdair Kergon <agk@redhat.com>
Subject: [PATCH 2.6.20-rc5] dm-multipath: fix stall on noflush suspend/resume
Date: Fri, 19 Jan 2007 19:47:27 -0500	[thread overview]
Message-ID: <45B1669F.90102@ce.jp.nec.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 1114 bytes --]

Allow noflush suspend/resume of device-mapper device only for
the case where the device size is unchanged.

Otherwise, dm-multipath devices can stall when resumed if noflush
was used when suspending them, all paths have failed and
queue_if_no_path is set.

Explanation:
 1. Something is doing fsync() on the block dev,
    holding inode->i_sem
 2. The fsync write is blocked by all-paths-down and queue_if_no_path
 3. Someone requests to suspend the dm device with noflush.
    Pending writes are left in queue.
 4. In the middle of dm_resume(), __bind() tries to get
    inode->i_sem to do __set_size() and waits forever.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>

---
'noflush suspend' is a new device-mapper feature introduced in
early 2.6.20. So I hope the fix being included before 2.6.20 is
released.

Example of reproducer:
 1. Create a multipath device by dmsetup
 2. Fail all paths during mkfs
 3. Do dmsetup suspend --noflush and load new map with healthy paths
 4. Do dmsetup resume


 drivers/md/dm.c |   27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)


[-- Attachment #2: dm-noflush-fix-stall-on-resume.patch --]
[-- Type: text/x-patch, Size: 1671 bytes --]

Index: linux-2.6.19/drivers/md/dm.c
===================================================================
--- linux-2.6.19.orig/drivers/md/dm.c	2006-12-12 22:02:29.000000000 +0000
+++ linux-2.6.19/drivers/md/dm.c	2007-01-05 15:15:53.000000000 +0000
@@ -1116,7 +1116,8 @@ static int __bind(struct mapped_device *
 	if (size != get_capacity(md->disk))
 		memset(&md->geometry, 0, sizeof(md->geometry));
 
-	__set_size(md, size);
+	if (md->suspended_bdev)
+		__set_size(md, size);
 	if (size == 0)
 		return 0;
 
@@ -1265,6 +1266,11 @@ int dm_swap_table(struct mapped_device *
 	if (!dm_suspended(md))
 		goto out;
 
+	/* without bdev, the device size cannot be changed */
+	if (!md->suspended_bdev)
+		if (get_capacity(md->disk) != dm_table_get_size(table))
+			goto out;
+
 	__unbind(md);
 	r = __bind(md, table);
 
@@ -1342,11 +1348,14 @@ int dm_suspend(struct mapped_device *md,
 	/* This does not get reverted if there's an error later. */
 	dm_table_presuspend_targets(map);
 
-	md->suspended_bdev = bdget_disk(md->disk, 0);
-	if (!md->suspended_bdev) {
-		DMWARN("bdget failed in dm_suspend");
-		r = -ENOMEM;
-		goto flush_and_out;
+	/* bdget() can stall if the pending I/Os are not flushed */
+	if (!noflush) {
+		md->suspended_bdev = bdget_disk(md->disk, 0);
+		if (!md->suspended_bdev) {
+			DMWARN("bdget failed in dm_suspend");
+			r = -ENOMEM;
+			goto flush_and_out;
+		}
 	}
 
 	/*
@@ -1474,8 +1483,10 @@ int dm_resume(struct mapped_device *md)
 
 	unlock_fs(md);
 
-	bdput(md->suspended_bdev);
-	md->suspended_bdev = NULL;
+	if (md->suspended_bdev) {
+		bdput(md->suspended_bdev);
+		md->suspended_bdev = NULL;
+	}
 
 	clear_bit(DMF_SUSPENDED, &md->flags);
 

[-- Attachment #3: Type: text/plain, Size: 0 bytes --]



WARNING: multiple messages have this Message-ID (diff)
From: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
To: linux-kernel@vger.kernel.org, Andrew Morton <akpm@osdl.org>
Cc: Alasdair Kergon <agk@redhat.com>,
	device-mapper development <dm-devel@redhat.com>
Subject: [PATCH 2.6.20-rc5] dm-multipath: fix stall on noflush suspend/resume
Date: Fri, 19 Jan 2007 19:47:27 -0500	[thread overview]
Message-ID: <45B1669F.90102@ce.jp.nec.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 1114 bytes --]

Allow noflush suspend/resume of device-mapper device only for
the case where the device size is unchanged.

Otherwise, dm-multipath devices can stall when resumed if noflush
was used when suspending them, all paths have failed and
queue_if_no_path is set.

Explanation:
 1. Something is doing fsync() on the block dev,
    holding inode->i_sem
 2. The fsync write is blocked by all-paths-down and queue_if_no_path
 3. Someone requests to suspend the dm device with noflush.
    Pending writes are left in queue.
 4. In the middle of dm_resume(), __bind() tries to get
    inode->i_sem to do __set_size() and waits forever.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>

---
'noflush suspend' is a new device-mapper feature introduced in
early 2.6.20. So I hope the fix being included before 2.6.20 is
released.

Example of reproducer:
 1. Create a multipath device by dmsetup
 2. Fail all paths during mkfs
 3. Do dmsetup suspend --noflush and load new map with healthy paths
 4. Do dmsetup resume


 drivers/md/dm.c |   27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)


[-- Attachment #2: dm-noflush-fix-stall-on-resume.patch --]
[-- Type: text/x-patch, Size: 1671 bytes --]

Index: linux-2.6.19/drivers/md/dm.c
===================================================================
--- linux-2.6.19.orig/drivers/md/dm.c	2006-12-12 22:02:29.000000000 +0000
+++ linux-2.6.19/drivers/md/dm.c	2007-01-05 15:15:53.000000000 +0000
@@ -1116,7 +1116,8 @@ static int __bind(struct mapped_device *
 	if (size != get_capacity(md->disk))
 		memset(&md->geometry, 0, sizeof(md->geometry));
 
-	__set_size(md, size);
+	if (md->suspended_bdev)
+		__set_size(md, size);
 	if (size == 0)
 		return 0;
 
@@ -1265,6 +1266,11 @@ int dm_swap_table(struct mapped_device *
 	if (!dm_suspended(md))
 		goto out;
 
+	/* without bdev, the device size cannot be changed */
+	if (!md->suspended_bdev)
+		if (get_capacity(md->disk) != dm_table_get_size(table))
+			goto out;
+
 	__unbind(md);
 	r = __bind(md, table);
 
@@ -1342,11 +1348,14 @@ int dm_suspend(struct mapped_device *md,
 	/* This does not get reverted if there's an error later. */
 	dm_table_presuspend_targets(map);
 
-	md->suspended_bdev = bdget_disk(md->disk, 0);
-	if (!md->suspended_bdev) {
-		DMWARN("bdget failed in dm_suspend");
-		r = -ENOMEM;
-		goto flush_and_out;
+	/* bdget() can stall if the pending I/Os are not flushed */
+	if (!noflush) {
+		md->suspended_bdev = bdget_disk(md->disk, 0);
+		if (!md->suspended_bdev) {
+			DMWARN("bdget failed in dm_suspend");
+			r = -ENOMEM;
+			goto flush_and_out;
+		}
 	}
 
 	/*
@@ -1474,8 +1483,10 @@ int dm_resume(struct mapped_device *md)
 
 	unlock_fs(md);
 
-	bdput(md->suspended_bdev);
-	md->suspended_bdev = NULL;
+	if (md->suspended_bdev) {
+		bdput(md->suspended_bdev);
+		md->suspended_bdev = NULL;
+	}
 
 	clear_bit(DMF_SUSPENDED, &md->flags);
 

             reply	other threads:[~2007-01-20  0:47 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-01-20  0:47 Jun'ichi Nomura [this message]
2007-01-20  0:47 ` [PATCH 2.6.20-rc5] dm-multipath: fix stall on noflush suspend/resume Jun'ichi Nomura

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45B1669F.90102@ce.jp.nec.com \
    --to=j-nomura@ce.jp.nec.com \
    --cc=agk@redhat.com \
    --cc=akpm@osdl.org \
    --cc=dm-devel@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.