public inbox for linux-raid@vger.kernel.org
From: Yu Kuai <yukuai1@huaweicloud.com>
To: Mikulas Patocka <mpatocka@redhat.com>, Song Liu <song@kernel.org>,
	David Jeffery <djeffery@redhat.com>, Li Nan <linan122@huawei.com>
Cc: dm-devel@lists.linux.dev, linux-raid@vger.kernel.org,
	Mike Snitzer <msnitzer@redhat.com>,
	Heinz Mauelshagen <heinzm@redhat.com>,
	Benjamin Marzinski <bmarzins@redhat.com>,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH 1/7] md: Revert fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration")
Date: Thu, 18 Jan 2024 09:27:53 +0800
Message-ID: <c6eb9966-8a6f-e749-3d25-b0e606149750@huaweicloud.com>
In-Reply-To: <2c29cbd4-736d-2f23-2bc-636881c150d6@redhat.com>

Hi,

On 2024/01/18 2:17, Mikulas Patocka wrote:
> The commit fa2bbff7b0b4 breaks the LVM2 test shell/integrity-caching.sh,
> so let's revert it.
> 
> sysrq: Show Blocked State
> task:lvm             state:D stack:0     pid:8275  tgid:8275  ppid:1373   flags:0x00000002
> Call Trace:
>   <TASK>
>   __schedule+0x228/0x570
>   ? __percpu_ref_switch_mode+0xb7/0x1b0
>   schedule+0x29/0xa0
>   mddev_suspend+0xec/0x1a0 [md_mod]

We really need more information about the root cause here. If
mddev_suspend() is waiting for this flush IO to complete, then why
can't the flush IO finish?
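
To make the question concrete, here is a minimal userspace sketch of the
reference counting described in the comment that the patch removes, as I
read it. All model_* names are hypothetical, and a plain counter stands in
for the percpu_ref 'active_io'; this is a simplified model, not the md code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two mddev fields involved. */
struct model_mddev {
	long active_io;     /* models the percpu_ref 'active_io' */
	int flush_pending;  /* models the atomic 'flush_pending' */
};

/* Models md_handle_request() taking and dropping its own reference. */
static void model_handle_request_enter(struct model_mddev *m)
{
	m->active_io++;
}

static void model_handle_request_exit(struct model_mddev *m)
{
	m->active_io--;
}

/* Models md_flush_request(): the caller must already hold a reference. */
static void model_flush_request(struct model_mddev *m)
{
	assert(m->active_io > 0);  /* mirrors WARN_ON(percpu_ref_is_zero()) */
	m->active_io++;            /* extra reference held until the flush is done */
	m->flush_pending++;
}

/* Models md_end_flush(): the last completion drops the extra reference. */
static void model_end_flush(struct model_mddev *m)
{
	if (--m->flush_pending == 0)
		m->active_io--;
}

/* Models the condition mddev_suspend() waits for. */
static bool model_suspend_can_proceed(const struct model_mddev *m)
{
	return m->active_io == 0;
}
```

In this model, suspend can only proceed after model_end_flush() has run.
So a task stuck in mddev_suspend() would mean the flush completion never
arrived, and that is the part the report does not explain.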

Thanks,
Kuai

>   ? housekeeping_test_cpu+0x30/0x30
>   dm_table_postsuspend_targets+0x34/0x50 [dm_mod]
>   __dm_destroy+0x1c5/0x1e0 [dm_mod]
>   ? table_clear+0xa0/0xa0 [dm_mod]
>   dev_remove+0xd4/0x110 [dm_mod]
>   ctl_ioctl+0x2e1/0x570 [dm_mod]
>   dm_ctl_ioctl+0x5/0x10 [dm_mod]
>   __x64_sys_ioctl+0x85/0xa0
>   do_syscall_64+0x5d/0x1a0
>   entry_SYSCALL_64_after_hwframe+0x46/0x4e
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Fixes: fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration")
> 
> ---
>   drivers/md/md.c |   21 ++++++---------------
>   1 file changed, 6 insertions(+), 15 deletions(-)
> 
> Index: linux-2.6/drivers/md/md.c
> ===================================================================
> --- linux-2.6.orig/drivers/md/md.c
> +++ linux-2.6/drivers/md/md.c
> @@ -543,9 +543,6 @@ static void md_end_flush(struct bio *bio
>   	rdev_dec_pending(rdev, mddev);
>   
>   	if (atomic_dec_and_test(&mddev->flush_pending)) {
> -		/* The pair is percpu_ref_get() from md_flush_request() */
> -		percpu_ref_put(&mddev->active_io);
> -
>   		/* The pre-request flush has finished */
>   		queue_work(md_wq, &mddev->flush_work);
>   	}
> @@ -565,7 +562,12 @@ static void submit_flushes(struct work_s
>   	rdev_for_each_rcu(rdev, mddev)
>   		if (rdev->raid_disk >= 0 &&
>   		    !test_bit(Faulty, &rdev->flags)) {
> +			/* Take two references, one is dropped
> +			 * when request finishes, one after
> +			 * we reclaim rcu_read_lock
> +			 */
>   			struct bio *bi;
> +			atomic_inc(&rdev->nr_pending);
>   
>   			atomic_inc(&rdev->nr_pending);
>   			rcu_read_unlock();
> @@ -577,6 +579,7 @@ static void submit_flushes(struct work_s
>   			atomic_inc(&mddev->flush_pending);
>   			submit_bio(bi);
>   			rcu_read_lock();
> +			rdev_dec_pending(rdev, mddev);
>   		}
>   	rcu_read_unlock();
>   	if (atomic_dec_and_test(&mddev->flush_pending))
> @@ -629,18 +632,6 @@ bool md_flush_request(struct mddev *mdde
>   	/* new request after previous flush is completed */
>   	if (ktime_after(req_start, mddev->prev_flush_start)) {
>   		WARN_ON(mddev->flush_bio);
> -		/*
> -		 * Grab a reference to make sure mddev_suspend() will wait for
> -		 * this flush to be done.
> -		 *
> -		 * md_flush_reqeust() is called under md_handle_request() and
> -		 * 'active_io' is already grabbed, hence percpu_ref_is_zero()
> -		 * won't pass, percpu_ref_tryget_live() can't be used because
> -		 * percpu_ref_kill() can be called by mddev_suspend()
> -		 * concurrently.
> -		 */
> -		WARN_ON(percpu_ref_is_zero(&mddev->active_io));
> -		percpu_ref_get(&mddev->active_io);
>   		mddev->flush_bio = bio;
>   		bio = NULL;
>   	}
> 
> .
> 
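
For reference, the "take two references" comment restored in
submit_flushes() can be sketched in userspace as below. This is only an
illustration under my own assumptions: the model_* names and the
completes_early flag are hypothetical, and C11 atomics stand in for the
kernel's atomic_t and rdev_dec_pending().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rdev field involved. */
struct model_rdev {
	atomic_int nr_pending;
};

/* Models the completion side: drops the reference owned by the request. */
static void model_end_flush_one(struct model_rdev *rdev)
{
	atomic_fetch_sub(&rdev->nr_pending, 1);
}

/*
 * Models one loop iteration of submit_flushes(). Returns the value of
 * nr_pending observed just before the RCU read lock would be re-taken.
 */
static int model_submit_one_flush(struct model_rdev *rdev, bool completes_early)
{
	int seen;

	/* The kernel holds rcu_read_lock() here. */
	atomic_fetch_add(&rdev->nr_pending, 1);  /* dropped when the request finishes */
	atomic_fetch_add(&rdev->nr_pending, 1);  /* dropped after re-taking the lock */

	/* Unlocked window: the bio is submitted and may complete right away. */
	if (completes_early)
		model_end_flush_one(rdev);
	seen = atomic_load(&rdev->nr_pending);

	/* The kernel re-takes rcu_read_lock() here, then drops the second reference. */
	atomic_fetch_sub(&rdev->nr_pending, 1);
	return seen;
}
```

Even when the request completes inside the unlocked window, the count
observed there is still nonzero, which is what keeps the device pinned
until the loop is back under the read lock.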


Thread overview: 34+ messages
2024-01-17 18:16 [PATCH 0/7] MD fixes for the LVM2 testsuite Mikulas Patocka
2024-01-17 18:17 ` [PATCH 1/7] md: Revert fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration") Mikulas Patocka
2024-01-18  1:27   ` Yu Kuai [this message]
2024-01-17 18:18 ` [PATCH 2/7] md: fix a race condition when stopping the sync thread Mikulas Patocka
2024-01-18  1:32   ` Yu Kuai
2024-01-18 13:07     ` Mikulas Patocka
2024-01-18 13:20       ` Yu Kuai
2024-01-18 13:28         ` Mikulas Patocka
2024-01-17 18:19 ` [PATCH 3/7] md: test for MD_RECOVERY_DONE in stop_sync_thread Mikulas Patocka
2024-01-18  0:19   ` Song Liu
2024-01-18 13:23     ` Mikulas Patocka
2024-01-18 21:10       ` Song Liu
2024-01-22 16:34         ` Mikulas Patocka
2024-01-23  2:31           ` Benjamin Marzinski
2024-01-26  9:17             ` Yu Kuai
2024-01-26  9:37               ` Yu Kuai
2024-01-26 10:29                 ` Zdenek Kabelac
2024-01-27  1:13                   ` Yu Kuai
2024-01-27  1:19                     ` Yu Kuai
2024-01-18  1:35   ` Yu Kuai
2024-01-17 18:20 ` [PATCH 4/7] md: call md_reap_sync_thread from __md_stop_writes Mikulas Patocka
2024-01-18  1:38   ` Yu Kuai
2024-01-17 18:21 ` [PATCH 5/7] md: fix deadlock in shell/lvconvert-raid-reshape-linear_to_raid6-single-type.sh Mikulas Patocka
2024-01-18  1:12   ` Song Liu
2024-01-18  1:51   ` Yu Kuai
2024-01-17 18:22 ` [PATCH 6/7] md: partially revert "md/raid6: use valid sector values to determine if an I/O should wait on the reshape" Mikulas Patocka
2024-01-17 23:56   ` Song Liu
2024-01-17 18:22 ` [PATCH 7/7] md: fix a suspicious RCU usage warning Mikulas Patocka
2024-01-17 23:59   ` Song Liu
2024-01-18  1:56   ` Yu Kuai
2024-01-25 17:31     ` Song Liu
2024-01-17 19:27 ` [PATCH 0/7] MD fixes for the LVM2 testsuite Song Liu
2024-01-18  2:03   ` Yu Kuai
2024-01-27  7:57 ` Yu Kuai
