From: Xiao Ni <xni@redhat.com>
To: NeilBrown <neilb@suse.com>, linux-raid <linux-raid@vger.kernel.org>
Cc: shli@kernel.org
Subject: Re: Stuck in md_write_start because MD_SB_CHANGE_PENDING can't be cleared
Date: Sat, 30 Sep 2017 17:44:53 +0800
Message-ID: <07f22731-4acb-dccf-12ba-ed4ce63c5537@redhat.com>
In-Reply-To: <87k21ec4fn.fsf@notabene.neil.brown.name>



On 09/05/2017 09:36 AM, NeilBrown wrote:
> On Mon, Sep 04 2017, Xiao Ni wrote:
>
>>
>> In function handle_stripe:
>> 	if (s.handle_bad_blocks ||
>> 	    test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
>> 		set_bit(STRIPE_HANDLE, &sh->state);
>> 		goto finish;
>> 	}
>>
>> Because MD_SB_CHANGE_PENDING is set, the stripes can't be handled.
>>
> Right, of course.  I see what is happening now.
>
> - raid5d cannot complete stripes until the metadata is written
> - the metadata cannot be written until raid5d gets the mddev_lock
> - mddev_lock is held by the write to suspend_hi
> - the write to suspend_hi is waiting for raid5_quiesce
> - raid5_quiesce is waiting for some stripes to complete.
>
> We could declare that ->quiesce(, 1) cannot be called while holding the
> lock.
> We could possibly allow it but only if md_update_sb() is called first,
> though that might still be racy.
>
> ->quiesce(, 1) is currently called from:
>   mddev_suspend
>   suspend_lo_store
>   suspend_hi_store
>   __md_stop_writes
>   mddev_detach
>   set_bitmap_file
>   update_array_info (when setting/removing internal bitmap)
>   md_do_sync
>
> and most of those are called with the lock held, or take the lock.
>
> Maybe we should *require* that mddev_lock is held when calling
> ->quiesce() and have ->quiesce() do the metadata update.
>
> Something like the following maybe.  Can you test it?
> Thanks,
> NeilBrown
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index b01e458d31e9..999ccf08c5db 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -5805,9 +5805,11 @@ void md_stop(struct mddev *mddev)
>   	/* stop the array and free an attached data structures.
>   	 * This is called from dm-raid
>   	 */
> +	mddev_lock_nointr(mddev);
>   	__md_stop(mddev);
>   	if (mddev->bio_set)
>   		bioset_free(mddev->bio_set);
> +	mddev_unlock(mddev);
>   }
>   
>   EXPORT_SYMBOL_GPL(md_stop);
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 0fc2748aaf95..cde5a82eb404 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -4316,6 +4316,8 @@ static void handle_stripe_expansion(struct r5conf *conf, struct stripe_head *sh)
>   
>   			/* place all the copies on one channel */
>   			init_async_submit(&submit, 0, tx, NULL, NULL, NULL);
> +			WARN_ON(sh2->dev[dd_idx].page != sh2->dev[dd_idx].orig_page);
> +			WARN_ON(sh->dev[i].page != sh->dev[i].orig_page);
>   			tx = async_memcpy(sh2->dev[dd_idx].page,
>   					  sh->dev[i].page, 0, 0, STRIPE_SIZE,
>   					  &submit);
> @@ -8031,7 +8033,10 @@ static void raid5_quiesce(struct mddev *mddev, int state)
>   		wait_event_cmd(conf->wait_for_quiescent,
>   				    atomic_read(&conf->active_stripes) == 0 &&
>   				    atomic_read(&conf->active_aligned_reads) == 0,
> -				    unlock_all_device_hash_locks_irq(conf),
> +				    ({unlock_all_device_hash_locks_irq(conf);
> +					if (mddev->sb_flags)
> +						md_update_sb(mddev, 0);
> +				    }),
>   				    lock_all_device_hash_locks_irq(conf));
>   		conf->quiesce = 1;
>   		unlock_all_device_hash_locks_irq(conf);
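
The five-step dependency chain Neil lists above can be written down as a
wait-for graph, which makes the cycle explicit. The following is a minimal
userspace sketch, not kernel code: the enum labels and has_wait_cycle() are
hypothetical names chosen for illustration, and each party is assumed to
wait on exactly one other party.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the parties in the chain; not kernel identifiers. */
enum waiter {
	RAID5D_STRIPES,   /* raid5d completing stripes */
	METADATA_WRITE,   /* the superblock (metadata) write */
	MDDEV_LOCK,       /* raid5d acquiring mddev_lock */
	SUSPEND_HI_WRITE, /* the sysfs write to suspend_hi, holding the lock */
	RAID5_QUIESCE,    /* raid5_quiesce() waiting for stripes */
	NR_WAITERS
};

/* deadlock_chain[x] is the party that x cannot progress without (-1: none). */
static const int deadlock_chain[NR_WAITERS] = {
	[RAID5D_STRIPES]   = METADATA_WRITE,
	[METADATA_WRITE]   = MDDEV_LOCK,
	[MDDEV_LOCK]       = SUSPEND_HI_WRITE,
	[SUSPEND_HI_WRITE] = RAID5_QUIESCE,
	[RAID5_QUIESCE]    = RAID5D_STRIPES,
};

/* Follow wait-for edges from 'start'; true if a node is reached twice. */
static bool has_wait_cycle(const int *waits_on, size_t n, int start)
{
	bool seen[NR_WAITERS] = { false };
	int cur = start;

	assert(n <= NR_WAITERS);
	while (cur >= 0 && (size_t)cur < n) {
		if (seen[cur])
			return true;	/* came back to a party already waiting */
		seen[cur] = true;
		cur = waits_on[cur];
	}
	return false;		/* chain ended at a party that can progress */
}
```

Calling has_wait_cycle(deadlock_chain, NR_WAITERS, RAID5D_STRIPES) walks the
whole loop and reports a cycle. Removing any one edge, for example by letting
the metadata write proceed without waiting for mddev_lock, breaks it, which
is the idea behind having ->quiesce() do the metadata update itself.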

Hi Neil

I read this patch again, but I still don't understand why it doesn't work.
It calls md_update_sb() while it waits for active_stripes to drop to zero,
which should clear MD_SB_CHANGE_CLEAN and MD_SB_CHANGE_PENDING.

Could you explain this?
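
To make the expectation concrete, here is a small userspace model of what
the patched wait loop is meant to do. It is only a sketch under stated
assumptions, not kernel code: struct toy_array and the toy_*() helpers are
hypothetical, the toy update is assumed to clear every pending flag in one
call, and the "sleep" is modelled as a single raid5d pass. It follows the
wait_event_cmd() pattern of running the first command before sleeping and
the second one after waking, and it does not explain why the real patch
fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; field names echo the patch but this is not the kernel struct. */
struct toy_array {
	unsigned long sb_flags;	/* nonzero: a superblock update is pending */
	int active_stripes;	/* stripes that have not completed yet */
	int sb_updates;		/* how many times the toy update ran */
};

/* Assumption: one successful update clears all pending flags. */
static void toy_update_sb(struct toy_array *a)
{
	a->sb_flags = 0;
	a->sb_updates++;
}

/* Models the handle_stripe() check: no progress while an update is pending. */
static void toy_raid5d_pass(struct toy_array *a)
{
	if (a->sb_flags == 0 && a->active_stripes > 0)
		a->active_stripes--;
}

/*
 * Models the wait loop: while the condition is false, run cmd1, sleep,
 * run cmd2. update_in_cmd1 selects the patched behaviour. Returns true
 * if the array quiesced within max_rounds.
 */
static bool toy_quiesce(struct toy_array *a, bool update_in_cmd1, int max_rounds)
{
	assert(max_rounds >= 0);
	for (int round = 0; round < max_rounds; round++) {
		if (a->active_stripes == 0)
			return true;		/* wait condition met */
		if (update_in_cmd1 && a->sb_flags)
			toy_update_sb(a);	/* cmd1: flush pending metadata */
		toy_raid5d_pass(a);		/* "sleep": raid5d gets to run */
		/* cmd2: retake the hash locks (no-op in this model) */
	}
	return a->active_stripes == 0;
}
```

In this model, an array that starts with sb_flags set and three active
stripes never quiesces when update_in_cmd1 is false, and quiesces after a
single toy update when it is true. If the real code behaves differently,
one of the model's assumptions presumably does not hold there.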

Best Regards
Xiao

