From: NeilBrown <neilb@suse.com>
To: Xiao Ni <xni@redhat.com>, linux-raid <linux-raid@vger.kernel.org>
Cc: shli@kernel.org
Subject: Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared
Date: Tue, 05 Sep 2017 11:36:12 +1000
Message-ID: <87k21ec4fn.fsf@notabene.neil.brown.name>
In-Reply-To: <34fedde7-cef9-34ff-1403-9d097267eb55@redhat.com>
On Mon, Sep 04 2017, Xiao Ni wrote:
>
>
> In function handle_stripe:
> 	if (s.handle_bad_blocks ||
> 	    test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
> 		set_bit(STRIPE_HANDLE, &sh->state);
> 		goto finish;
> 	}
>
> Because MD_SB_CHANGE_PENDING is set, the stripes can't be handled.
>
Right, of course. I see what is happening now.
- raid5d cannot complete stripes until the metadata is written
- the metadata cannot be written until raid5d gets the mddev_lock
- mddev_lock is held by the write to suspend_hi
- the write to suspend_hi is waiting for raid5_quiesce
- raid5_quiesce is waiting for some stripes to complete.
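To make the cycle explicit, here is a minimal user-space sketch of the five waits listed above as a wait-for chain. It is only an illustration, not kernel code: the enum labels, the waits_for table and cycle_length() are hypothetical names invented for this example.

```c
/* Hypothetical labels for the five parties in the report above.
 * These are illustration names, not kernel symbols.
 */
enum actor {
	STRIPES,	/* stripes that raid5d must complete */
	SB_WRITE,	/* the metadata (superblock) write */
	MDDEV_LOCK,	/* the lock taken by mddev_lock() */
	SUSPEND_HI,	/* the sysfs write to suspend_hi */
	QUIESCE,	/* raid5_quiesce() */
	NR_ACTORS
};

/* waits_for[x] is the one thing x cannot proceed without. */
const int waits_for[NR_ACTORS] = {
	[STRIPES]    = SB_WRITE,	/* stripes need the metadata written */
	[SB_WRITE]   = MDDEV_LOCK,	/* metadata write needs the lock */
	[MDDEV_LOCK] = SUSPEND_HI,	/* lock is held by the suspend_hi write */
	[SUSPEND_HI] = QUIESCE,		/* suspend_hi write waits for quiesce */
	[QUIESCE]    = STRIPES,		/* quiesce waits for stripes to complete */
};

/* Follow the chain from 'start'.  Return the number of hops needed to
 * get back to 'start', or 0 if the chain ends (an entry of -1) or does
 * not return to 'start' within n hops.
 */
int cycle_length(const int *edges, int n, int start)
{
	int cur = start;

	for (int hops = 1; hops <= n; hops++) {
		cur = edges[cur];
		if (cur < 0)
			return 0;	/* waits on nothing: no deadlock */
		if (cur == start)
			return hops;
	}
	return 0;
}
```

Walking the table from STRIPES passes through all five entries and arrives back at STRIPES, which is the deadlock. Setting any single entry to -1 (that party no longer has to wait) ends the chain.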
We could declare that ->quiesce(, 1) cannot be called while holding the
lock.
We could possibly allow it, but only if md_update_sb() is called first,
though that might still be racy.
->quiesce(, 1) is currently called from:
  mddev_suspend
  suspend_lo_store
  suspend_hi_store
  __md_stop_writes
  mddev_detach
  set_bitmap_file
  update_array_info (when setting/removing internal bitmap)
  md_do_sync
and most of those are called with the lock held, or take the lock.
Maybe we should *require* that mddev_lock is held when calling
->quiesce() and have ->quiesce() do the metadata update.
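As a rough illustration of that idea, here is a single-threaded toy model in user-space C. It is a sketch under stated assumptions, not the md code: struct model, worker_step() and quiesce_model() are hypothetical names, the worker stands in for raid5d, and clearing sb_change_pending stands in for a successful md_update_sb().

```c
#include <stdbool.h>

/* Toy state.  All names here are hypothetical and for illustration only. */
struct model {
	int  active_stripes;	/* stripes still in flight */
	bool sb_change_pending;	/* stands in for MD_SB_CHANGE_PENDING */
	bool lock_held;		/* the quiescing caller holds the lock */
};

/* One pass of the worker (standing in for raid5d). */
void worker_step(struct model *m)
{
	if (m->sb_change_pending) {
		if (m->lock_held)
			return;	/* cannot get the lock, so no progress */
		m->sb_change_pending = false;	/* metadata written */
	}
	if (m->active_stripes > 0)
		m->active_stripes--;	/* complete one stripe */
}

/* Wait for the stripes to drain while holding the lock.  When
 * update_sb_while_waiting is set, the waiter writes the metadata itself
 * on each pass before letting the worker run.  Returns true if the
 * stripes drained within max_steps passes.
 */
bool quiesce_model(struct model *m, bool update_sb_while_waiting,
		   int max_steps)
{
	m->lock_held = true;
	for (int step = 0; m->active_stripes > 0 && step < max_steps; step++) {
		if (update_sb_while_waiting && m->sb_change_pending)
			m->sb_change_pending = false;	/* models md_update_sb() */
		worker_step(m);
	}
	return m->active_stripes == 0;
}
```

With update_sb_while_waiting false and a metadata change pending, the model never drains, however large max_steps is. With it true, the stripes drain in as many passes as there are stripes. The model ignores the races mentioned earlier.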
Something like the following maybe. Can you test it?
Thanks,
NeilBrown
diff --git a/drivers/md/md.c b/drivers/md/md.c
index b01e458d31e9..999ccf08c5db 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5805,9 +5805,11 @@ void md_stop(struct mddev *mddev)
 	/* stop the array and free an attached data structures.
 	 * This is called from dm-raid
 	 */
+	mddev_lock_nointr(mddev);
 	__md_stop(mddev);
 	if (mddev->bio_set)
 		bioset_free(mddev->bio_set);
+	mddev_unlock(mddev);
 }
 
 EXPORT_SYMBOL_GPL(md_stop);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 0fc2748aaf95..cde5a82eb404 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4316,6 +4316,8 @@ static void handle_stripe_expansion(struct r5conf *conf, struct stripe_head *sh)
 
 			/* place all the copies on one channel */
 			init_async_submit(&submit, 0, tx, NULL, NULL, NULL);
+			WARN_ON(sh2->dev[dd_idx].page != sh2->dev[dd_idx].orig_page);
+			WARN_ON(sh->dev[i].page != sh->dev[i].orig_page);
 			tx = async_memcpy(sh2->dev[dd_idx].page,
 					  sh->dev[i].page, 0, 0, STRIPE_SIZE,
 					  &submit);
@@ -8031,7 +8033,10 @@ static void raid5_quiesce(struct mddev *mddev, int state)
 		wait_event_cmd(conf->wait_for_quiescent,
 				    atomic_read(&conf->active_stripes) == 0 &&
 				    atomic_read(&conf->active_aligned_reads) == 0,
-				    unlock_all_device_hash_locks_irq(conf),
+				    ({unlock_all_device_hash_locks_irq(conf);
+				      if (mddev->sb_flags)
+					      md_update_sb(mddev, 0);
+				    }),
 				    lock_all_device_hash_locks_irq(conf));
 		conf->quiesce = 1;
 		unlock_all_device_hash_locks_irq(conf);