[PATCH v4 1/7] md: Make md resync and reshape threads freezable

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v4 1/7] md: Make md resync and reshape threads freezable
       [not found] <20170925202924.16603-1-bart.vanassche@wdc.com>
@ 2017-09-25 20:29 ` Bart Van Assche
  2017-09-25 23:04   ` Ming Lei
                     ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Bart Van Assche @ 2017-09-25 20:29 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	=Oleksandr Natalenko, Bart Van Assche, Shaohua Li, linux-raid,
	Ming Lei, Hannes Reinecke, Johannes Thumshirn

Some people use the md driver on laptops and use the suspend and
resume functionality. Since it is essential that submitting of
new I/O requests stops before device quiescing starts, make the
md resync and reshape threads freezable.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/md/md.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 08fcaebc61bd..26a12bd0db65 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -66,6 +66,7 @@
 #include <linux/raid/md_u.h>
 #include <linux/slab.h>
 #include <linux/percpu-refcount.h>
+#include <linux/freezer.h>
 
 #include <trace/events/block.h>
 #include "md.h"
@@ -7424,6 +7425,7 @@ static int md_thread(void *arg)
 	 */
 
 	allow_signal(SIGKILL);
+	set_freezable();
 	while (!kthread_should_stop()) {
 
 		/* We need to wait INTERRUPTIBLE so that
@@ -7434,7 +7436,7 @@ static int md_thread(void *arg)
 		if (signal_pending(current))
 			flush_signals(current);
 
-		wait_event_interruptible_timeout
+		wait_event_freezable_timeout
 			(thread->wqueue,
 			 test_bit(THREAD_WAKEUP, &thread->flags)
 			 || kthread_should_stop() || kthread_should_park(),
@@ -8133,6 +8135,8 @@ void md_do_sync(struct md_thread *thread)
 		return;
 	}
 
+	set_freezable();
+
 	if (mddev_is_clustered(mddev)) {
 		ret = md_cluster_ops->resync_start(mddev);
 		if (ret)
@@ -8324,7 +8328,7 @@ void md_do_sync(struct md_thread *thread)
 		     mddev->curr_resync_completed > mddev->resync_max
 			    )) {
 			/* time to update curr_resync_completed */
-			wait_event(mddev->recovery_wait,
+			wait_event_freezable(mddev->recovery_wait,
 				   atomic_read(&mddev->recovery_active) == 0);
 			mddev->curr_resync_completed = j;
 			if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) &&
@@ -8342,10 +8346,10 @@ void md_do_sync(struct md_thread *thread)
 			 * to avoid triggering warnings.
 			 */
 			flush_signals(current); /* just in case */
-			wait_event_interruptible(mddev->recovery_wait,
-						 mddev->resync_max > j
-						 || test_bit(MD_RECOVERY_INTR,
-							     &mddev->recovery));
+			wait_event_freezable(mddev->recovery_wait,
+					     mddev->resync_max > j ||
+					     test_bit(MD_RECOVERY_INTR,
+						      &mddev->recovery));
 		}
 
 		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
@@ -8421,7 +8425,7 @@ void md_do_sync(struct md_thread *thread)
 				 * Give other IO more of a chance.
 				 * The faster the devices, the less we wait.
 				 */
-				wait_event(mddev->recovery_wait,
+				wait_event_freezable(mddev->recovery_wait,
 					   !atomic_read(&mddev->recovery_active));
 			}
 		}
@@ -8433,7 +8437,8 @@ void md_do_sync(struct md_thread *thread)
 	 * this also signals 'finished resyncing' to md_stop
 	 */
 	blk_finish_plug(&plug);
-	wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
+	wait_event_freezable(mddev->recovery_wait,
+			     !atomic_read(&mddev->recovery_active));
 
 	if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
 	    !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-25 20:29 ` [PATCH v4 1/7] md: Make md resync and reshape threads freezable Bart Van Assche
@ 2017-09-25 23:04   ` Ming Lei
  2017-09-25 23:09     ` Bart Van Assche
  2017-09-26  6:06   ` Hannes Reinecke
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2017-09-25 23:04 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Martin K . Petersen,
	=Oleksandr Natalenko, Shaohua Li, linux-raid, Hannes Reinecke,
	Johannes Thumshirn

On Mon, Sep 25, 2017 at 01:29:18PM -0700, Bart Van Assche wrote:
> Some people use the md driver on laptops and use the suspend and
> resume functionality. Since it is essential that submitting of
> new I/O requests stops before device quiescing starts, make the
> md resync and reshape threads freezable.

As I explained, if SCSI quiesce is safe, this patch shouldn't
be needed.

The issue isn't MD specific, and in theory can be triggered
on all devices. And you can see the I/O hang report on BTRFS(RAID)
without MD involved:

	https://marc.info/?l=linux-block&m=150634883816965&w=2

So please remove this patch from the patchset.

--
Ming

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-25 23:04   ` Ming Lei
@ 2017-09-25 23:09     ` Bart Van Assche
  2017-09-26  4:01       ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2017-09-25 23:09 UTC (permalink / raw)
  To: ming.lei@redhat.com
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Tue, 2017-09-26 at 07:04 +0800, Ming Lei wrote:
> On Mon, Sep 25, 2017 at 01:29:18PM -0700, Bart Van Assche wrote:
> > Some people use the md driver on laptops and use the suspend and
> > resume functionality. Since it is essential that submitting of
> > new I/O requests stops before device quiescing starts, make the
> > md resync and reshape threads freezable.
> 
> As I explained, if SCSI quiesce is safe, this patch shouldn't
> be needed.
> 
> The issue isn't MD specific, and in theory can be triggered
> on all devices. And you can see the I/O hang report on BTRFS(RAID)
> without MD involved:
> 
> 	https://marc.info/?l=linux-block&m=150634883816965&w=2

What makes you think that this patch is not necessary once SCSI quiesce
has been made safe? Does this mean that you have not tested suspend and
resume while md RAID 1 resync was in progress? This patch is necessary
to avoid that suspend locks up while md RAID 1 resync is in progress.

Bart.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-25 23:09     ` Bart Van Assche
@ 2017-09-26  4:01       ` Ming Lei
  2017-09-26  8:13         ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2017-09-26  4:01 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Mon, Sep 25, 2017 at 11:09:15PM +0000, Bart Van Assche wrote:
> On Tue, 2017-09-26 at 07:04 +0800, Ming Lei wrote:
> > On Mon, Sep 25, 2017 at 01:29:18PM -0700, Bart Van Assche wrote:
> > > Some people use the md driver on laptops and use the suspend and
> > > resume functionality. Since it is essential that submitting of
> > > new I/O requests stops before device quiescing starts, make the
> > > md resync and reshape threads freezable.
> > 
> > As I explained, if SCSI quiesce is safe, this patch shouldn't
> > be needed.
> > 
> > The issue isn't MD specific, and in theory can be triggered
> > on all devices. And you can see the I/O hang report on BTRFS(RAID)
> > without MD involved:
> > 
> > 	https://marc.info/?l=linux-block&m=150634883816965&w=2
> 
> What makes you think that this patch is not necessary once SCSI quiesce
> has been made safe? Does this mean that you have not tested suspend and

If we want to make SCSI quiesce safe, we have to drain up all submitted
I/O and prevent new I/O from being submitted, that is enough to deal
with MD's resync too.

> resume while md RAID 1 resync was in progress? This patch is necessary
> to avoid that suspend locks up while md RAID 1 resync is in progress.

I tested my patchset on RAID10 when resync in progress, not see any
issue during suspend/resume, without any MD's change. I will test
RAID1's later, but I don't think there is difference compared with
RAID10 because my patchset can make the queue being quiesced totally.

The upstream report did not mention that the resync is in progress:

	https://marc.info/?l=linux-block&m=150402198315951&w=2

I want to make the patchset(make SCSI quiesce safe) focusing on this
SCSI quiesce issue.

Maybe with this MD patch, the original report becomes quite difficult
to reproduce, that doesn't mean this MD patch fixes this kind of
issue. What I really cares is that your changes on BLOCK/SCSI can
fix this issue.

-- 
Ming

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-25 20:29 ` [PATCH v4 1/7] md: Make md resync and reshape threads freezable Bart Van Assche
  2017-09-25 23:04   ` Ming Lei
@ 2017-09-26  6:06   ` Hannes Reinecke
  2017-09-26 11:17   ` Ming Lei
  2017-10-02 13:26   ` Christoph Hellwig
  3 siblings, 0 replies; 15+ messages in thread
From: Hannes Reinecke @ 2017-09-26  6:06 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	=Oleksandr Natalenko, Shaohua Li, linux-raid, Ming Lei,
	Hannes Reinecke, Johannes Thumshirn

On 09/25/2017 10:29 PM, Bart Van Assche wrote:
> Some people use the md driver on laptops and use the suspend and
> resume functionality. Since it is essential that submitting of
> new I/O requests stops before device quiescing starts, make the
> md resync and reshape threads freezable.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Shaohua Li <shli@kernel.org>
> Cc: linux-raid@vger.kernel.org
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/md/md.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-26  4:01       ` Ming Lei
@ 2017-09-26  8:13         ` Ming Lei
  2017-09-26 14:40           ` Bart Van Assche
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2017-09-26  8:13 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Tue, Sep 26, 2017 at 12:01:03PM +0800, Ming Lei wrote:
> On Mon, Sep 25, 2017 at 11:09:15PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-09-26 at 07:04 +0800, Ming Lei wrote:
> > > On Mon, Sep 25, 2017 at 01:29:18PM -0700, Bart Van Assche wrote:
> > > > Some people use the md driver on laptops and use the suspend and
> > > > resume functionality. Since it is essential that submitting of
> > > > new I/O requests stops before device quiescing starts, make the
> > > > md resync and reshape threads freezable.
> > > 
> > > As I explained, if SCSI quiesce is safe, this patch shouldn't
> > > be needed.
> > > 
> > > The issue isn't MD specific, and in theory can be triggered
> > > on all devices. And you can see the I/O hang report on BTRFS(RAID)
> > > without MD involved:
> > > 
> > > 	https://marc.info/?l=linux-block&m=150634883816965&w=2
> > 
> > What makes you think that this patch is not necessary once SCSI quiesce
> > has been made safe? Does this mean that you have not tested suspend and
> 
> If we want to make SCSI quiesce safe, we have to drain up all submitted
> I/O and prevent new I/O from being submitted, that is enough to deal
> with MD's resync too.
> 
> > resume while md RAID 1 resync was in progress? This patch is necessary
> > to avoid that suspend locks up while md RAID 1 resync is in progress.
> 
> I tested my patchset on RAID10 when resync in progress, not see any
> issue during suspend/resume, without any MD's change. I will test
> RAID1's later, but I don't think there is difference compared with
> RAID10 because my patchset can make the queue being quiesced totally.

I am pretty sure that suspend/resume can survive when resync in progress
with my patchset applied on RAID1, without any MD change.

There are reports on suspend/resume on btrfs(RAID) and revalidate path
in scsi_transport_spi device, so the issue isn't MD specific again.

If your patchset depends on this MD change, something should be wrong
in the following patches. Now I need to take a close look.


-- 
Ming

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-25 20:29 ` [PATCH v4 1/7] md: Make md resync and reshape threads freezable Bart Van Assche
  2017-09-25 23:04   ` Ming Lei
  2017-09-26  6:06   ` Hannes Reinecke
@ 2017-09-26 11:17   ` Ming Lei
  2017-09-26 14:42     ` Bart Van Assche
  2017-10-02 13:26   ` Christoph Hellwig
  3 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2017-09-26 11:17 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Martin K . Petersen,
	=Oleksandr Natalenko, Shaohua Li, linux-raid, Hannes Reinecke,
	Johannes Thumshirn

On Mon, Sep 25, 2017 at 01:29:18PM -0700, Bart Van Assche wrote:
> Some people use the md driver on laptops and use the suspend and
> resume functionality. Since it is essential that submitting of
> new I/O requests stops before device quiescing starts, make the
> md resync and reshape threads freezable.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Shaohua Li <shli@kernel.org>
> Cc: linux-raid@vger.kernel.org
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/md/md.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 08fcaebc61bd..26a12bd0db65 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -66,6 +66,7 @@
>  #include <linux/raid/md_u.h>
>  #include <linux/slab.h>
>  #include <linux/percpu-refcount.h>
> +#include <linux/freezer.h>
>  
>  #include <trace/events/block.h>
>  #include "md.h"
> @@ -7424,6 +7425,7 @@ static int md_thread(void *arg)
>  	 */
>  
>  	allow_signal(SIGKILL);
> +	set_freezable();
>  	while (!kthread_should_stop()) {
>  
>  		/* We need to wait INTERRUPTIBLE so that
> @@ -7434,7 +7436,7 @@ static int md_thread(void *arg)
>  		if (signal_pending(current))
>  			flush_signals(current);
>  
> -		wait_event_interruptible_timeout
> +		wait_event_freezable_timeout
>  			(thread->wqueue,
>  			 test_bit(THREAD_WAKEUP, &thread->flags)
>  			 || kthread_should_stop() || kthread_should_park(),
> @@ -8133,6 +8135,8 @@ void md_do_sync(struct md_thread *thread)
>  		return;
>  	}
>  
> +	set_freezable();
> +
>  	if (mddev_is_clustered(mddev)) {
>  		ret = md_cluster_ops->resync_start(mddev);
>  		if (ret)
> @@ -8324,7 +8328,7 @@ void md_do_sync(struct md_thread *thread)
>  		     mddev->curr_resync_completed > mddev->resync_max
>  			    )) {
>  			/* time to update curr_resync_completed */
> -			wait_event(mddev->recovery_wait,
> +			wait_event_freezable(mddev->recovery_wait,
>  				   atomic_read(&mddev->recovery_active) == 0);
>  			mddev->curr_resync_completed = j;
>  			if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) &&
> @@ -8342,10 +8346,10 @@ void md_do_sync(struct md_thread *thread)
>  			 * to avoid triggering warnings.
>  			 */
>  			flush_signals(current); /* just in case */
> -			wait_event_interruptible(mddev->recovery_wait,
> -						 mddev->resync_max > j
> -						 || test_bit(MD_RECOVERY_INTR,
> -							     &mddev->recovery));
> +			wait_event_freezable(mddev->recovery_wait,
> +					     mddev->resync_max > j ||
> +					     test_bit(MD_RECOVERY_INTR,
> +						      &mddev->recovery));
>  		}
>  
>  		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
> @@ -8421,7 +8425,7 @@ void md_do_sync(struct md_thread *thread)
>  				 * Give other IO more of a chance.
>  				 * The faster the devices, the less we wait.
>  				 */
> -				wait_event(mddev->recovery_wait,
> +				wait_event_freezable(mddev->recovery_wait,
>  					   !atomic_read(&mddev->recovery_active));
>  			}
>  		}
> @@ -8433,7 +8437,8 @@ void md_do_sync(struct md_thread *thread)
>  	 * this also signals 'finished resyncing' to md_stop
>  	 */
>  	blk_finish_plug(&plug);
> -	wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
> +	wait_event_freezable(mddev->recovery_wait,
> +			     !atomic_read(&mddev->recovery_active));
>  
>  	if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
>  	    !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
> -- 
> 2.14.1
> 

Just test this patch a bit and the following failure of freezing task
is triggered during suspend:

[   38.903513] PM: suspend entry (deep)
[   38.904443] PM: Syncing filesystems ... done.
[   38.983591] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   38.987522] OOM killer disabled.
[   38.987962] Freezing remaining freezable tasks ...
[   58.998872] Freezing of tasks failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0):
[   59.002539] md127_resync    D    0  1618      2 0x80000000
[   59.004954] Call Trace:
[   59.006162]  __schedule+0x41f/0xa50
[   59.007704]  schedule+0x3d/0x90
[   59.009305]  raid1_sync_request+0x2da/0xd10 [raid1]
[   59.011505]  ? remove_wait_queue+0x70/0x70
[   59.013352]  md_do_sync+0xdfa/0x12c0
[   59.014955]  ? remove_wait_queue+0x70/0x70
[   59.016336]  md_thread+0x1a8/0x1e0
[   59.016770]  ? md_thread+0x1a8/0x1e0
[   59.017250]  kthread+0x155/0x190
[   59.017662]  ? sync_speed_show+0xa0/0xa0
[   59.018217]  ? kthread_create_on_node+0x70/0x70
[   59.018858]  ret_from_fork+0x2a/0x40
[   59.019403] Restarting kernel threads ... done.
[   59.024586] OOM killer enabled.
[   59.025124] Restarting tasks ... done.
[   59.045906] PM: suspend exit
[   97.919428] systemd-journald[227]: Sent WATCHDOG=1 notification.
[  101.002695] md: md127: data-check done.



-- 
Ming

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-26  8:13         ` Ming Lei
@ 2017-09-26 14:40           ` Bart Van Assche
  2017-09-26 15:02             ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2017-09-26 14:40 UTC (permalink / raw)
  To: ming.lei@redhat.com
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Tue, 2017-09-26 at 16:13 +0800, Ming Lei wrote:
> I am pretty sure that suspend/resume can survive when resync in progress
> with my patchset applied on RAID1, without any MD change.

The above shows that you do not understand how suspend and resume works.
In the documents in the directory Documentation/power it is explained clearly
that I/O must be stopped before the hibernation image is generation to avoid
hard to repair filesystem corruption. Although md is not a filesystem I think
this also applies to md since md keeps some state information in RAM and some
state information on disk. It is essential that all that state information is
in consistent.

> If your patchset depends on this MD change, something should be wrong
> in the following patches. Now I need to take a close look.

The later patches that make SCSI quiesce and resume safe do not depend on
this change.

Bart.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-26 11:17   ` Ming Lei
@ 2017-09-26 14:42     ` Bart Van Assche
  2017-09-26 14:59       ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2017-09-26 14:42 UTC (permalink / raw)
  To: ming.lei@redhat.com
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Tue, 2017-09-26 at 19:17 +0800, Ming Lei wrote:
> Just test this patch a bit and the following failure of freezing task
> is triggered during suspend: [ ... ]

What kernel version did you start from and which patches were applied on top of
that kernel? Only patch 1/7 or all seven patches? What storage configuration did
you use in your test and what command(s) did you use to trigger suspend?

Bart.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-26 14:42     ` Bart Van Assche
@ 2017-09-26 14:59       ` Ming Lei
  2017-09-27  3:12         ` Bart Van Assche
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2017-09-26 14:59 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Tue, Sep 26, 2017 at 02:42:07PM +0000, Bart Van Assche wrote:
> On Tue, 2017-09-26 at 19:17 +0800, Ming Lei wrote:
> > Just test this patch a bit and the following failure of freezing task
> > is triggered during suspend: [ ... ]
> 
> What kernel version did you start from and which patches were applied on top of
> that kernel? Only patch 1/7 or all seven patches? What storage configuration did

It is v4.14-rc1+, and top commit is 8d93c7a43157, with all your 7 patches
applied.

> you use in your test and what command(s) did you use to trigger suspend?

Follows my pm test script:

	#!/bin/sh
	
	echo check > /sys/block/md127/md/sync_action
	
	mkfs.ext4 -F /dev/md127
	
	mount /dev/md0 /mnt/data
	
	dd if=/dev/zero of=/mnt/data/d1.img bs=4k count=128k&
	
	echo 9 > /proc/sys/kernel/printk
	echo devices > /sys/power/pm_test
	echo mem > /sys/power/state
	
	wait
	umount /mnt/data

Storage setting:

	sudo mdadm --create /dev/md/test /dev/sda /dev/sdb --level=1 --raid-devices=2
	both /dev/sda and /dev/sdb are virtio-scsi.


-- 
Ming

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-26 14:40           ` Bart Van Assche
@ 2017-09-26 15:02             ` Ming Lei
  0 siblings, 0 replies; 15+ messages in thread
From: Ming Lei @ 2017-09-26 15:02 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Tue, Sep 26, 2017 at 02:40:36PM +0000, Bart Van Assche wrote:
> On Tue, 2017-09-26 at 16:13 +0800, Ming Lei wrote:
> > I am pretty sure that suspend/resume can survive when resync in progress
> > with my patchset applied on RAID1, without any MD change.
> 
> The above shows that you do not understand how suspend and resume works.
> In the documents in the directory Documentation/power it is explained clearly
> that I/O must be stopped before the hibernation image is generation to avoid

No, I don't use hibernation, and just use suspend/resume(s2r).

> hard to repair filesystem corruption. Although md is not a filesystem I think
> this also applies to md since md keeps some state information in RAM and some
> state information on disk. It is essential that all that state information is
> in consistent.
> 
> > If your patchset depends on this MD change, something should be wrong
> > in the following patches. Now I need to take a close look.
> 
> The later patches that make SCSI quiesce and resume safe do not depend on
> this change.

Are you sure?

If I remove the 1st patch, system suspend/resume will hang with all your
other 6 patchset.


-- 
Ming

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-26 14:59       ` Ming Lei
@ 2017-09-27  3:12         ` Bart Van Assche
  2017-09-27 11:00           ` Ming Lei
  0 siblings, 1 reply; 15+ messages in thread
From: Bart Van Assche @ 2017-09-27  3:12 UTC (permalink / raw)
  To: ming.lei@redhat.com
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Tue, 2017-09-26 at 22:59 +0800, Ming Lei wrote:
> On Tue, Sep 26, 2017 at 02:42:07PM +0000, Bart Van Assche wrote:
> > On Tue, 2017-09-26 at 19:17 +0800, Ming Lei wrote:
> > > Just test this patch a bit and the following failure of freezing task
> > > is triggered during suspend: [ ... ]
> > 
> > What kernel version did you start from and which patches were applied on top of
> > that kernel? Only patch 1/7 or all seven patches? What storage configuration did
> 
> It is v4.14-rc1+, and top commit is 8d93c7a43157, with all your 7 patches
> applied.
> 
> > you use in your test and what command(s) did you use to trigger suspend?
> 
> Follows my pm test script:
> 
> 	#!/bin/sh
> 	
> 	echo check > /sys/block/md127/md/sync_action
> 	
> 	mkfs.ext4 -F /dev/md127
> 	
> 	mount /dev/md0 /mnt/data
> 	
> 	dd if=/dev/zero of=/mnt/data/d1.img bs=4k count=128k&
> 	
> 	echo 9 > /proc/sys/kernel/printk
> 	echo devices > /sys/power/pm_test
> 	echo mem > /sys/power/state
> 	
> 	wait
> 	umount /mnt/data
> 
> Storage setting:
> 
> 	sudo mdadm --create /dev/md/test /dev/sda /dev/sdb --level=1 --raid-devices=2
> 	both /dev/sda and /dev/sdb are virtio-scsi.

Thanks for the detailed reply. I have been able to reproduce the freeze failure
you reported. The output of SysRq-t learned me that the md reboot notifier was
waiting for the frozen md sync thread and that this caused the freeze failure. So
I have started testing the patch below instead of the patch at the start of this
e-mail thread:


Subject: [PATCH] md: Stop resync and reshape upon system freeze

Some people use the md driver on laptops and use the suspend and
resume functionality. Since it is essential that submitting of
new I/O requests stops before a hibernation image is created,
interrupt the md resync and reshape actions if the system is
being frozen. Note: the resync and reshape will restart after
the system is resumed and a message similar to the following
will appear in the system log:

md: md0: data-check interrupted.

---
 drivers/md/md.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 08fcaebc61bd..1e9d50f7345e 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -66,6 +66,7 @@
 #include <linux/raid/md_u.h>
 #include <linux/slab.h>
 #include <linux/percpu-refcount.h>
+#include <linux/freezer.h>
 
 #include <trace/events/block.h>
 #include "md.h"
@@ -8103,6 +8104,12 @@ void md_allow_write(struct mddev *mddev)
 }
 EXPORT_SYMBOL_GPL(md_allow_write);
 
+static bool md_sync_interrupted(struct mddev *mddev)
+{
+	return test_bit(MD_RECOVERY_INTR, &mddev->recovery) ||
+		freezing(current);
+}
+
 #define SYNC_MARKS	10
 #define	SYNC_MARK_STEP	(3*HZ)
 #define UPDATE_FREQUENCY (5*60*HZ)
@@ -8133,6 +8140,8 @@ void md_do_sync(struct md_thread *thread)
 		return;
 	}
 
+	set_freezable();
+
 	if (mddev_is_clustered(mddev)) {
 		ret = md_cluster_ops->resync_start(mddev);
 		if (ret)
@@ -8184,7 +8193,7 @@ void md_do_sync(struct md_thread *thread)
 		mddev->curr_resync = 2;
 
 	try_again:
-		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+		if (md_sync_interrupted(mddev))
 			goto skip;
 		for_each_mddev(mddev2, tmp) {
 			if (mddev2 == mddev)
@@ -8208,7 +8217,7 @@ void md_do_sync(struct md_thread *thread)
 				 * be caught by 'softlockup'
 				 */
 				prepare_to_wait(&resync_wait, &wq, TASK_INTERRUPTIBLE);
-				if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
+				if (!md_sync_interrupted(mddev) &&
 				    mddev2->curr_resync >= mddev->curr_resync) {
 					if (mddev2_minor != mddev2->md_minor) {
 						mddev2_minor = mddev2->md_minor;
@@ -8335,8 +8344,7 @@ void md_do_sync(struct md_thread *thread)
 			sysfs_notify(&mddev->kobj, NULL, "sync_completed");
 		}
 
-		while (j >= mddev->resync_max &&
-		       !test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
+		while (j >= mddev->resync_max && !md_sync_interrupted(mddev)) {
 			/* As this condition is controlled by user-space,
 			 * we can block indefinitely, so use '_interruptible'
 			 * to avoid triggering warnings.
@@ -8348,7 +8356,7 @@ void md_do_sync(struct md_thread *thread)
 							     &mddev->recovery));
 		}
 
-		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+		if (md_sync_interrupted(mddev))
 			break;
 
 		sectors = mddev->pers->sync_request(mddev, j, &skipped);
@@ -8362,7 +8370,7 @@ void md_do_sync(struct md_thread *thread)
 			atomic_add(sectors, &mddev->recovery_active);
 		}
 
-		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+		if (md_sync_interrupted(mddev))
 			break;
 
 		j += sectors;
@@ -8394,7 +8402,7 @@ void md_do_sync(struct md_thread *thread)
 			last_mark = next;
 		}
 
-		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+		if (md_sync_interrupted(mddev))
 			break;
 
 		/*
@@ -8427,8 +8435,7 @@ void md_do_sync(struct md_thread *thread)
 		}
 	}
 	pr_info("md: %s: %s %s.\n",mdname(mddev), desc,
-		test_bit(MD_RECOVERY_INTR, &mddev->recovery)
-		? "interrupted" : "done");
+		md_sync_interrupted(mddev) ? "interrupted" : "done");
 	/*
 	 * this also signals 'finished resyncing' to md_stop
 	 */
@@ -8436,8 +8443,7 @@ void md_do_sync(struct md_thread *thread)
 	wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
 
 	if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
-	    !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
-	    mddev->curr_resync > 3) {
+	    !md_sync_interrupted(mddev) && mddev->curr_resync > 3) {
 		mddev->curr_resync_completed = mddev->curr_resync;
 		sysfs_notify(&mddev->kobj, NULL, "sync_completed");
 	}
@@ -8446,7 +8452,7 @@ void md_do_sync(struct md_thread *thread)
 	if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
 	    mddev->curr_resync > 3) {
 		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
-			if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
+			if (md_sync_interrupted(mddev)) {
 				if (mddev->curr_resync >= mddev->recovery_cp) {
 					pr_debug("md: checkpointing %s of %s.\n",
 						 desc, mdname(mddev));
@@ -8461,7 +8467,7 @@ void md_do_sync(struct md_thread *thread)
 			} else
 				mddev->recovery_cp = MaxSector;
 		} else {
-			if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+			if (!md_sync_interrupted(mddev))
 				mddev->curr_resync = MaxSector;
 			rcu_read_lock();
 			rdev_for_each_rcu(rdev, mddev)
@@ -8483,7 +8489,7 @@ void md_do_sync(struct md_thread *thread)
 		      BIT(MD_SB_CHANGE_PENDING) | BIT(MD_SB_CHANGE_DEVS));
 
 	spin_lock(&mddev->lock);
-	if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
+	if (!md_sync_interrupted(mddev)) {
 		/* We completed so min/max setting can be forgotten if used. */
 		if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
 			mddev->resync_min = 0;

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-27  3:12         ` Bart Van Assche
@ 2017-09-27 11:00           ` Ming Lei
  2017-10-02 15:39             ` Bart Van Assche
  0 siblings, 1 reply; 15+ messages in thread
From: Ming Lei @ 2017-09-27 11:00 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Wed, Sep 27, 2017 at 03:12:47AM +0000, Bart Van Assche wrote:
> On Tue, 2017-09-26 at 22:59 +0800, Ming Lei wrote:
> > On Tue, Sep 26, 2017 at 02:42:07PM +0000, Bart Van Assche wrote:
> > > On Tue, 2017-09-26 at 19:17 +0800, Ming Lei wrote:
> > > > Just test this patch a bit and the following failure of freezing task
> > > > is triggered during suspend: [ ... ]
> > > 
> > > What kernel version did you start from and which patches were applied on top of
> > > that kernel? Only patch 1/7 or all seven patches? What storage configuration did
> > 
> > It is v4.14-rc1+, and top commit is 8d93c7a43157, with all your 7 patches
> > applied.
> > 
> > > you use in your test and what command(s) did you use to trigger suspend?
> > 
> > Follows my pm test script:
> > 
> > 	#!/bin/sh
> > 	
> > 	echo check > /sys/block/md127/md/sync_action
> > 	
> > 	mkfs.ext4 -F /dev/md127
> > 	
> > 	mount /dev/md0 /mnt/data
> > 	
> > 	dd if=/dev/zero of=/mnt/data/d1.img bs=4k count=128k&
> > 	
> > 	echo 9 > /proc/sys/kernel/printk
> > 	echo devices > /sys/power/pm_test
> > 	echo mem > /sys/power/state
> > 	
> > 	wait
> > 	umount /mnt/data
> > 
> > Storage setting:
> > 
> > 	sudo mdadm --create /dev/md/test /dev/sda /dev/sdb --level=1 --raid-devices=2
> > 	both /dev/sda and /dev/sdb are virtio-scsi.
> 
> Thanks for the detailed reply. I have been able to reproduce the freeze failure
> you reported. The output of SysRq-t learned me that the md reboot notifier was
> waiting for the frozen md sync thread and that this caused the freeze failure. So
> I have started testing the patch below instead of the patch at the start of this
> e-mail thread:

OK, thanks for figuring out the reason.

With current linus tree, SCSI I/O is prevented from being dispatch to
device during suspend by SCSI quiesce, and will be dispatched again
in resume. With Safe SCSI quiesce[1], any underlying IO request will
stop at blk_queue_enter() during suspend, and restart from there during
resume.

For other non-SCSI driver, their .suspend/.resume often
handles I/O in similar way, for example, NVMe queue will
be frozen in .suspend, and unfreeze in .resume.

So could you explain a bit which kind of bug this patch fixes?

[1] https://marc.info/?l=linux-block&m=150649136519281&w=2


-- 
Ming

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-25 20:29 ` [PATCH v4 1/7] md: Make md resync and reshape threads freezable Bart Van Assche
                     ` (2 preceding siblings ...)
  2017-09-26 11:17   ` Ming Lei
@ 2017-10-02 13:26   ` Christoph Hellwig
  3 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2017-10-02 13:26 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Martin K . Petersen,
	=Oleksandr Natalenko, Shaohua Li, linux-raid, Ming Lei,
	Hannes Reinecke, Johannes Thumshirn

Independent of whether this should be required to make scsi quience
safe or not making the dm thread freeze aware is the right thing
to do, and this patch looks correct to me:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/7] md: Make md resync and reshape threads freezable
  2017-09-27 11:00           ` Ming Lei
@ 2017-10-02 15:39             ` Bart Van Assche
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Van Assche @ 2017-10-02 15:39 UTC (permalink / raw)
  To: ming.lei@redhat.com
  Cc: linux-block@vger.kernel.org, jthumshirn@suse.de,
	linux-raid@vger.kernel.org, hch@lst.de,
	martin.petersen@oracle.com, axboe@kernel.dk,
	oleksandr@natalenko.name, hare@suse.com, shli@kernel.org

On Wed, 2017-09-27 at 19:00 +0800, Ming Lei wrote:
> With current linus tree, SCSI I/O is prevented from being dispatch to
> device during suspend by SCSI quiesce, and will be dispatched again
> in resume. With Safe SCSI quiesce[1], any underlying IO request will
> stop at blk_queue_enter() during suspend, and restart from there during
> resume.
> 
> For other non-SCSI driver, their .suspend/.resume often
> handles I/O in similar way, for example, NVMe queue will
> be frozen in .suspend, and unfreeze in .resume.
> 
> So could you explain a bit which kind of bug this patch fixes?

It seems like you do not fully understand the motivation behind quiescing and
resuming I/O from inside the SCSI and NVMe freeze and thaw power management
callbacks. Hibernation can only work reliably if no I/O is submitted after
creation of the hibernation image has been started. If any I/O is submitted
that is not related to the hibernation process itself by any driver after
user space processes and kernel threads have been frozen then that's a *bug*
in the component that submitted the I/O. The freezing and suspending of I/O
from inside the SCSI and NVMe drivers is a *workaround* for these bugs but is
not a full solution. Before the hibernation image is written to disk I/O is
resumed and the I/O requests that got deferred will be executed at that time.
In other words, suspending and resuming I/O from inside the SCSI and NVMe
drivers is a workaround and not a full solution. The following quote from
Documentation/power/swsusp.txt illustrates this:

 * BIG FAT WARNING *********************************************************
 *
 * If you touch anything on disk between suspend and resume...
 *				...kiss your data goodbye.

Bart.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-10-02 15:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20170925202924.16603-1-bart.vanassche@wdc.com>
2017-09-25 20:29 ` [PATCH v4 1/7] md: Make md resync and reshape threads freezable Bart Van Assche
2017-09-25 23:04   ` Ming Lei
2017-09-25 23:09     ` Bart Van Assche
2017-09-26  4:01       ` Ming Lei
2017-09-26  8:13         ` Ming Lei
2017-09-26 14:40           ` Bart Van Assche
2017-09-26 15:02             ` Ming Lei
2017-09-26  6:06   ` Hannes Reinecke
2017-09-26 11:17   ` Ming Lei
2017-09-26 14:42     ` Bart Van Assche
2017-09-26 14:59       ` Ming Lei
2017-09-27  3:12         ` Bart Van Assche
2017-09-27 11:00           ` Ming Lei
2017-10-02 15:39             ` Bart Van Assche
2017-10-02 13:26   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).