All of lore.kernel.org
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: Jack Wang <jinpu.wang@profitbricks.com>
Cc: linux-raid@vger.kernel.org, Jack Wang <xjtuwjp@gmail.com>,
	Sebastian Riemer <sebastian.riemer@profitbricks.com>
Subject: Re: [BUG] md hang at schedule in  md_write_start
Date: Tue, 13 Aug 2013 14:31:15 +1000	[thread overview]
Message-ID: <20130813143115.55dd27d1@notabene.brown> (raw)
In-Reply-To: <52090E6D.10104@profitbricks.com>

[-- Attachment #1: Type: text/plain, Size: 3636 bytes --]

On Mon, 12 Aug 2013 18:33:49 +0200 Jack Wang <jinpu.wang@profitbricks.com>
wrote:

> Hi Neil,
> 
> 
> We've found md hang in our test, it's easy to reproduce with script
> attached.
> 
> We've tried 3.4 stable kernel and latest mainline, it still exists.
> 
> Looks like flush bdi_writeback_workfn race with md_stop, no idea how to
> fix it, could you kindly give us suggestions?
> 
> Best regards,
> Jack

Thanks for the report.  I can see how that deadlock could happen.

Can you please try this patch and confirm that it fixes it.
I'm not really happy with this approach but nothing better occurs to me yet.

NeilBrown

diff --git a/drivers/md/md.c b/drivers/md/md.c
index a57b0fa..c66af69 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5144,7 +5144,7 @@ int md_run(struct mddev *mddev)
 	
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 	
-	if (mddev->flags)
+	if (mddev->flags & MD_UPDATE_SB_FLAGS)
 		md_update_sb(mddev, 0);
 
 	md_new_event(mddev);
@@ -5289,7 +5289,7 @@ static void __md_stop_writes(struct mddev *mddev)
 	md_super_wait(mddev);
 
 	if (mddev->ro == 0 &&
-	    (!mddev->in_sync || mddev->flags)) {
+	    (!mddev->in_sync || (mddev->flags & MD_UPDATE_SB_FLAGS))) {
 		/* mark array as shutdown cleanly */
 		mddev->in_sync = 1;
 		md_update_sb(mddev, 1);
@@ -5337,8 +5337,11 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 		err = -EBUSY;
 		goto out;
 	}
-	if (bdev)
+	if (bdev) {
+		set_bit(MD_FINAL_FLUSH, &mddev->flags);
 		sync_blockdev(bdev);
+		clear_bit(MD_FINAL_FLUSH, &mddev->flags);
+	}
 	if (mddev->pers) {
 		__md_stop_writes(mddev);
 
@@ -5373,13 +5376,16 @@ static int do_md_stop(struct mddev * mddev, int mode,
 		mutex_unlock(&mddev->open_mutex);
 		return -EBUSY;
 	}
-	if (bdev)
+	if (bdev) {
 		/* It is possible IO was issued on some other
 		 * open file which was closed before we took ->open_mutex.
 		 * As that was not the last close __blkdev_put will not
 		 * have called sync_blockdev, so we must.
 		 */
+		set_bit(MD_FINAL_FLUSH, &mddev->flags);
 		sync_blockdev(bdev);
+		clear_bit(MD_FINAL_FLUSH, &mddev->flags);
+	}
 
 	if (mddev->pers) {
 		if (mddev->ro)
@@ -7814,7 +7820,7 @@ void md_check_recovery(struct mddev *mddev)
 				sysfs_notify_dirent_safe(mddev->sysfs_state);
 		}
 
-		if (mddev->flags)
+		if (mddev->flags & MD_UPDATE_SB_FLAGS)
 			md_update_sb(mddev, 0);
 
 		if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
@@ -7904,7 +7910,10 @@ void md_check_recovery(struct mddev *mddev)
 					sysfs_notify_dirent_safe(mddev->sysfs_action);
 		}
 		mddev_unlock(mddev);
-	}
+	} else if (test_bit(MD_FINAL_FLUSH, &mddev->flags) &&
+		   mddev->in_sync == 0 &&
+		   (mddev->flags & MD_UPDATE_SB_FLAGS))
+		md_update_sb(mddev, 0);
 }
 
 void md_reap_sync_thread(struct mddev *mddev)
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 77924d3..e1e003a 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -209,7 +209,12 @@ struct mddev {
 #define MD_CHANGE_DEVS	0	/* Some device status has changed */
 #define MD_CHANGE_CLEAN 1	/* transition to or from 'clean' */
 #define MD_CHANGE_PENDING 2	/* switch from 'clean' to 'active' in progress */
+#define MD_UPDATE_SB_FLAGS (1 | 2 | 4)	/* If these are set, md_update_sb needed */
 #define MD_ARRAY_FIRST_USE 3    /* First use of array, needs initialization */
+#define MD_FINAL_FLUSH	4	/* md_check_recovery is permitted to call
+				 * md_update_sb() to switch to 'active'
+				 * without taking reconfig_mutex
+				 */
 
 	int				suspended;
 	atomic_t			active_io;

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

  reply	other threads:[~2013-08-13  4:31 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-12 16:33 [BUG] md hang at schedule in md_write_start Jack Wang
2013-08-13  4:31 ` NeilBrown [this message]
2013-08-13  7:42   ` Jack Wang
2013-08-14  0:44     ` NeilBrown
2013-08-14  8:09       ` Jack Wang
2013-09-10 11:09         ` Jack Wang
2013-09-10 23:54           ` NeilBrown
2013-09-11  7:40             ` Jack Wang
2013-09-11 22:59               ` NeilBrown
2013-09-12  7:55                 ` Jack Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130813143115.55dd27d1@notabene.brown \
    --to=neilb@suse.de \
    --cc=jinpu.wang@profitbricks.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=sebastian.riemer@profitbricks.com \
    --cc=xjtuwjp@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.