All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] Device mapper and dm-mpath fixes
@ 2017-08-09 18:32 Bart Van Assche
  2017-08-09 18:32 ` [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio() Bart Van Assche
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Bart Van Assche, dm-devel, Christoph Hellwig

Hello Mike,

The patches in this series are what I came up with while analyzing the
soft and hard lockups reported by Laurence Oberman. Please consider
these patches for kernel v4.14.

Thank you,

Bart.

Bart Van Assche (7):
  dm: Fix the second dec_pending() argument in __split_and_process_bio()
  dm: Fix printk() rate limiting code
  dm-mpath: Do not lock up a CPU with requeuing activity
  dm-mpath: Avoid that building with W=1 causes gcc 7 to complain about
    fall-through
  dm-mpath: Complain about unsupported __multipath_map_bio() return
    values
  dm-mpath: Retry BLK_STS_RESOURCE errors
  dm-mpath: Make the dm-sq requeuing behavior consistent with the dm-mq
    behavior

 drivers/md/dm-mpath.c         |  8 ++++++--
 drivers/md/dm-rq.c            |  9 +++++----
 drivers/md/dm.c               | 12 +-----------
 include/linux/device-mapper.h | 41 ++++++++++++-----------------------------
 4 files changed, 24 insertions(+), 46 deletions(-)

-- 
2.13.3

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio()
  2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
@ 2017-08-09 18:32 ` Bart Van Assche
  2017-08-10  9:28   ` Christoph Hellwig
  2017-08-10 14:28   ` Laurence Oberman
  2017-08-09 18:32 ` [PATCH 2/7] dm: Fix printk() rate limiting code Bart Van Assche
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Bart Van Assche, dm-devel, Laurence Oberman, Christoph Hellwig

Detected by sparse.

Fixes: commit 4e4cbee93d56 ("block: switch bios to blk_status_t")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Laurence Oberman <loberman@redhat.com>
---
 drivers/md/dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 2edbcc2d7d3f..8298670757e9 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1523,7 +1523,7 @@ static void __split_and_process_bio(struct mapped_device *md,
 	}
 
 	/* drop the extra reference count */
-	dec_pending(ci.io, error);
+	dec_pending(ci.io, errno_to_blk_status(error));
 }
 /*-----------------------------------------------------------------
  * CRUD END
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/7] dm: Fix printk() rate limiting code
  2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
  2017-08-09 18:32 ` [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio() Bart Van Assche
@ 2017-08-09 18:32 ` Bart Van Assche
  2017-08-09 18:32   ` Bart Van Assche
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Laurence Oberman, Namhyung Kim, dm-devel, Bart Van Assche,
	Christoph Hellwig, Alasdair G Kergon

Using the same rate limiting state for different kinds of messages
is wrong because this can cause a high frequency message to suppress
a report of a low frequency message. Hence use a unique rate limiting
state per message type.

Fixes: commit 71a16736a15e ("dm: use local printk ratelimit")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
---
 drivers/md/dm.c               | 10 ----------
 include/linux/device-mapper.h | 41 ++++++++++++-----------------------------
 2 files changed, 12 insertions(+), 39 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 8298670757e9..d669fddd9290 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -27,16 +27,6 @@
 
 #define DM_MSG_PREFIX "core"
 
-#ifdef CONFIG_PRINTK
-/*
- * ratelimit state to be used in DMXXX_LIMIT().
- */
-DEFINE_RATELIMIT_STATE(dm_ratelimit_state,
-		       DEFAULT_RATELIMIT_INTERVAL,
-		       DEFAULT_RATELIMIT_BURST);
-EXPORT_SYMBOL(dm_ratelimit_state);
-#endif
-
 /*
  * Cookies are numeric values sent with CHANGE and REMOVE
  * uevents while resuming, removing or renaming the device.
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 1473455d0341..4f2b3b2076c4 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -549,46 +549,29 @@ void *dm_vcalloc(unsigned long nmemb, unsigned long elem_size);
  *---------------------------------------------------------------*/
 #define DM_NAME "device-mapper"
 
-#ifdef CONFIG_PRINTK
-extern struct ratelimit_state dm_ratelimit_state;
-
-#define dm_ratelimit()	__ratelimit(&dm_ratelimit_state)
-#else
-#define dm_ratelimit()	0
-#endif
+#define DM_RATELIMIT(pr_func, fmt, ...)					\
+do {									\
+	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,	\
+				      DEFAULT_RATELIMIT_BURST);		\
+									\
+	if (__ratelimit(&rs))						\
+		pr_func(DM_FMT(fmt), ##__VA_ARGS__);			\
+} while (0)
 
 #define DM_FMT(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt "\n"
 
 #define DMCRIT(fmt, ...) pr_crit(DM_FMT(fmt), ##__VA_ARGS__)
 
 #define DMERR(fmt, ...) pr_err(DM_FMT(fmt), ##__VA_ARGS__)
-#define DMERR_LIMIT(fmt, ...)						\
-do {									\
-	if (dm_ratelimit())						\
-		DMERR(fmt, ##__VA_ARGS__);				\
-} while (0)
-
+#define DMERR_LIMIT(fmt, ...) DM_RATELIMIT(pr_err, fmt, ##__VA_ARGS__)
 #define DMWARN(fmt, ...) pr_warn(DM_FMT(fmt), ##__VA_ARGS__)
-#define DMWARN_LIMIT(fmt, ...)						\
-do {									\
-	if (dm_ratelimit())						\
-		DMWARN(fmt, ##__VA_ARGS__);				\
-} while (0)
-
+#define DMWARN_LIMIT(fmt, ...) DM_RATELIMIT(pr_warn, fmt, ##__VA_ARGS__)
 #define DMINFO(fmt, ...) pr_info(DM_FMT(fmt), ##__VA_ARGS__)
-#define DMINFO_LIMIT(fmt, ...)						\
-do {									\
-	if (dm_ratelimit())						\
-		DMINFO(fmt, ##__VA_ARGS__);				\
-} while (0)
+#define DMINFO_LIMIT(fmt, ...) DM_RATELIMIT(pr_info, fmt, ##__VA_ARGS__)
 
 #ifdef CONFIG_DM_DEBUG
 #define DMDEBUG(fmt, ...) printk(KERN_DEBUG DM_FMT(fmt), ##__VA_ARGS__)
-#define DMDEBUG_LIMIT(fmt, ...)						\
-do {									\
-	if (dm_ratelimit())						\
-		DMDEBUG(fmt, ##__VA_ARGS__);				\
-} while (0)
+#define DMDEBUG_LIMIT(fmt, ...) DM_RATELIMIT(pr_debug, fmt, ##__VA_ARGS__)
 #else
 #define DMDEBUG(fmt, ...) no_printk(fmt, ##__VA_ARGS__)
 #define DMDEBUG_LIMIT(fmt, ...) no_printk(fmt, ##__VA_ARGS__)
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/7] dm-mpath: Do not lock up a CPU with requeuing activity
  2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
@ 2017-08-09 18:32   ` Bart Van Assche
  2017-08-09 18:32 ` [PATCH 2/7] dm: Fix printk() rate limiting code Bart Van Assche
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: dm-devel, Christoph Hellwig, Bart Van Assche, Laurence Oberman,
	stable

When using the block layer in single queue mode, get_request()
returns ERR_PTR(-EAGAIN) if the queue is dying and the REQ_NOWAIT
flag has been passed to get_request(). Avoid that the kernel
reports soft lockup complaints in this case due to continuous
requeuing activity.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: <stable@vger.kernel.org>
---
 drivers/md/dm-mpath.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 0e8ab5bb3575..0baa461eccaa 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -504,7 +504,6 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 		if (queue_dying) {
 			atomic_inc(&m->pg_init_in_progress);
 			activate_or_offline_path(pgpath);
-			return DM_MAPIO_REQUEUE;
 		}
 		return DM_MAPIO_DELAY_REQUEUE;
 	}
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/7] dm-mpath: Do not lock up a CPU with requeuing activity
@ 2017-08-09 18:32   ` Bart Van Assche
  0 siblings, 0 replies; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: dm-devel, Christoph Hellwig, Bart Van Assche, Laurence Oberman,
	stable

When using the block layer in single queue mode, get_request()
returns ERR_PTR(-EAGAIN) if the queue is dying and the REQ_NOWAIT
flag has been passed to get_request(). Avoid that the kernel
reports soft lockup complaints in this case due to continuous
requeuing activity.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: <stable@vger.kernel.org>
---
 drivers/md/dm-mpath.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 0e8ab5bb3575..0baa461eccaa 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -504,7 +504,6 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 		if (queue_dying) {
 			atomic_inc(&m->pg_init_in_progress);
 			activate_or_offline_path(pgpath);
-			return DM_MAPIO_REQUEUE;
 		}
 		return DM_MAPIO_DELAY_REQUEUE;
 	}
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/7] dm-mpath: Avoid that building with W=1 causes gcc 7 to complain about fall-through
  2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
                   ` (2 preceding siblings ...)
  2017-08-09 18:32   ` Bart Van Assche
@ 2017-08-09 18:32 ` Bart Van Assche
  2017-08-10  9:29   ` Christoph Hellwig
  2017-08-09 18:32 ` [PATCH 5/7] dm-mpath: Complain about unsupported __multipath_map_bio() return values Bart Van Assche
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Bart Van Assche, dm-devel, Laurence Oberman, Christoph Hellwig

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Laurence Oberman <loberman@redhat.com>
---
 drivers/md/dm-mpath.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 0baa461eccaa..6e20deac4add 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1379,6 +1379,7 @@ static void pg_init_done(void *data, int errors)
 	case SCSI_DH_RETRY:
 		/* Wait before retrying. */
 		delay_retry = 1;
+		/* fall through */
 	case SCSI_DH_IMM_RETRY:
 	case SCSI_DH_RES_TEMP_UNAVAIL:
 		if (pg_init_limit_reached(m, pgpath))
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/7] dm-mpath: Complain about unsupported __multipath_map_bio() return values
  2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
                   ` (3 preceding siblings ...)
  2017-08-09 18:32 ` [PATCH 4/7] dm-mpath: Avoid that building with W=1 causes gcc 7 to complain about fall-through Bart Van Assche
@ 2017-08-09 18:32 ` Bart Van Assche
  2017-08-10  9:29   ` Christoph Hellwig
  2017-08-09 18:32 ` [PATCH 6/7] dm-mpath: Retry BLK_STS_RESOURCE errors Bart Van Assche
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Bart Van Assche, dm-devel, Laurence Oberman, Christoph Hellwig

The only behavior change introduced by this patch is that WARN_ON_ONCE()
is called if __multipath_map_bio() would return an unsupported return
value.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Laurence Oberman <loberman@redhat.com>
---
 drivers/md/dm-mpath.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 6e20deac4add..f2ffa20c69b9 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -632,6 +632,11 @@ static void process_queued_bios(struct work_struct *work)
 		case DM_MAPIO_REMAPPED:
 			generic_make_request(bio);
 			break;
+		case 0:
+			break;
+		default:
+			WARN_ONCE(true, "__multipath_map_bio() returned %d\n",
+				  r);
 		}
 	}
 	blk_finish_plug(&plug);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/7] dm-mpath: Retry BLK_STS_RESOURCE errors
  2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
                   ` (4 preceding siblings ...)
  2017-08-09 18:32 ` [PATCH 5/7] dm-mpath: Complain about unsupported __multipath_map_bio() return values Bart Van Assche
@ 2017-08-09 18:32 ` Bart Van Assche
  2017-08-10  9:30   ` Christoph Hellwig
  2017-08-09 18:32 ` [PATCH 7/7] dm-mpath: Make the dm-sq requeuing behavior consistent with the dm-mq behavior Bart Van Assche
  2017-08-09 21:49 ` [PATCH 0/7] Device mapper and dm-mpath fixes Mike Snitzer
  7 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Bart Van Assche, dm-devel, Laurence Oberman, Christoph Hellwig

Retry requests instead of failing these if an out-of-memory error
occurs or the block driver below dm-mpath is busy. This restores
the v4.12 behavior of noretry_error(), namely that -ENOMEM results
in a retry.

Fixes: commit 2a842acab109 ("block: introduce new block status code type")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Laurence Oberman <loberman@redhat.com>
---
 drivers/md/dm-mpath.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index f2ffa20c69b9..3c1f8a89b10f 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1463,7 +1463,6 @@ static int noretry_error(blk_status_t error)
 	case BLK_STS_TARGET:
 	case BLK_STS_NEXUS:
 	case BLK_STS_MEDIUM:
-	case BLK_STS_RESOURCE:
 		return 1;
 	}
 
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 7/7] dm-mpath: Make the dm-sq requeuing behavior consistent with the dm-mq behavior
  2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
                   ` (5 preceding siblings ...)
  2017-08-09 18:32 ` [PATCH 6/7] dm-mpath: Retry BLK_STS_RESOURCE errors Bart Van Assche
@ 2017-08-09 18:32 ` Bart Van Assche
  2017-08-10  9:30   ` Christoph Hellwig
  2017-08-09 21:49 ` [PATCH 0/7] Device mapper and dm-mpath fixes Mike Snitzer
  7 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2017-08-09 18:32 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: Bart Van Assche, dm-devel, Laurence Oberman, Christoph Hellwig

DM_MAPIO_DELAY_REQUEUE causes dm-mq to requeue after a delay but
causes dm-sq to requeue immediately. Make the behavior of dm-sq
consistent with that of dm-mq.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Laurence Oberman <loberman@redhat.com>
---
 drivers/md/dm-rq.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index c6ebc5b1e00e..cbee09054d1e 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -237,14 +237,14 @@ static void dm_end_request(struct request *clone, blk_status_t error)
 /*
  * Requeue the original request of a clone.
  */
-static void dm_old_requeue_request(struct request *rq)
+static void dm_old_requeue_request(struct request *rq, unsigned long delay_ms)
 {
 	struct request_queue *q = rq->q;
 	unsigned long flags;
 
 	spin_lock_irqsave(q->queue_lock, flags);
 	blk_requeue_request(q, rq);
-	blk_run_queue_async(q);
+	blk_delay_queue(q, delay_ms);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
@@ -270,6 +270,7 @@ static void dm_requeue_original_request(struct dm_rq_target_io *tio, bool delay_
 	struct mapped_device *md = tio->md;
 	struct request *rq = tio->orig;
 	int rw = rq_data_dir(rq);
+	unsigned long delay_ms = delay_requeue ? 100 : 0;
 
 	rq_end_stats(md, rq);
 	if (tio->clone) {
@@ -278,9 +279,9 @@ static void dm_requeue_original_request(struct dm_rq_target_io *tio, bool delay_
 	}
 
 	if (!rq->q->mq_ops)
-		dm_old_requeue_request(rq);
+		dm_old_requeue_request(rq, delay_ms);
 	else
-		dm_mq_delay_requeue_request(rq, delay_requeue ? 100/*ms*/ : 0);
+		dm_mq_delay_requeue_request(rq, delay_ms);
 
 	rq_completed(md, rw, false);
 }
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/7] Device mapper and dm-mpath fixes
  2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
                   ` (6 preceding siblings ...)
  2017-08-09 18:32 ` [PATCH 7/7] dm-mpath: Make the dm-sq requeuing behavior consistent with the dm-mq behavior Bart Van Assche
@ 2017-08-09 21:49 ` Mike Snitzer
  7 siblings, 0 replies; 18+ messages in thread
From: Mike Snitzer @ 2017-08-09 21:49 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: dm-devel, Christoph Hellwig

On Wed, Aug 09 2017 at  2:32pm -0400,
Bart Van Assche <bart.vanassche@wdc.com> wrote:

> Hello Mike,
> 
> The patches in this series are what I came up with while analyzing the
> soft and hard lockups reported by Laurence Oberman. Please consider
> these patches for kernel v4.14.
> 
> Thank you,
> 
> Bart.
> 
> Bart Van Assche (7):
>   dm: Fix the second dec_pending() argument in __split_and_process_bio()
>   dm: Fix printk() rate limiting code
>   dm-mpath: Do not lock up a CPU with requeuing activity
>   dm-mpath: Avoid that building with W=1 causes gcc 7 to complain about
>     fall-through
>   dm-mpath: Complain about unsupported __multipath_map_bio() return
>     values
>   dm-mpath: Retry BLK_STS_RESOURCE errors
>   dm-mpath: Make the dm-sq requeuing behavior consistent with the dm-mq
>     behavior
> 
>  drivers/md/dm-mpath.c         |  8 ++++++--
>  drivers/md/dm-rq.c            |  9 +++++----
>  drivers/md/dm.c               | 12 +-----------
>  include/linux/device-mapper.h | 41 ++++++++++++-----------------------------
>  4 files changed, 24 insertions(+), 46 deletions(-)

Hey Bart,

Thanks for the patches.  I'm currently on a leave of absence to recover
from a nasty fall I had (broke my sacrum and dislocated my left thumb at
the wrist)... so: I won't be getting to these patches for a couple
weeks.

Mike

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio()
  2017-08-09 18:32 ` [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio() Bart Van Assche
@ 2017-08-10  9:28   ` Christoph Hellwig
  2017-08-10 14:28   ` Laurence Oberman
  1 sibling, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:28 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: dm-devel, Laurence Oberman, Christoph Hellwig, Mike Snitzer

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/7] dm-mpath: Do not lock up a CPU with requeuing activity
  2017-08-09 18:32   ` Bart Van Assche
  (?)
@ 2017-08-10  9:29   ` Christoph Hellwig
  -1 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:29 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Mike Snitzer, dm-devel, Christoph Hellwig, Laurence Oberman,
	stable

On Wed, Aug 09, 2017 at 11:32:12AM -0700, Bart Van Assche wrote:
> When using the block layer in single queue mode, get_request()
> returns ERR_PTR(-EAGAIN) if the queue is dying and the REQ_NOWAIT
> flag has been passed to get_request(). Avoid that the kernel
> reports soft lockup complaints in this case due to continuous
> requeuing activity.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/7] dm-mpath: Avoid that building with W=1 causes gcc 7 to complain about fall-through
  2017-08-09 18:32 ` [PATCH 4/7] dm-mpath: Avoid that building with W=1 causes gcc 7 to complain about fall-through Bart Van Assche
@ 2017-08-10  9:29   ` Christoph Hellwig
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:29 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: dm-devel, Laurence Oberman, Christoph Hellwig, Mike Snitzer

On Wed, Aug 09, 2017 at 11:32:13AM -0700, Bart Van Assche wrote:
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Laurence Oberman <loberman@redhat.com>
> ---
>  drivers/md/dm-mpath.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 0baa461eccaa..6e20deac4add 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -1379,6 +1379,7 @@ static void pg_init_done(void *data, int errors)
>  	case SCSI_DH_RETRY:
>  		/* Wait before retrying. */
>  		delay_retry = 1;
> +		/* fall through */
>  	case SCSI_DH_IMM_RETRY:
>  	case SCSI_DH_RES_TEMP_UNAVAIL:
>  		if (pg_init_limit_reached(m, pgpath))

I prefer /*FALLTHRU*/ over the less shouty lower case version, but either
way this looks okay:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/7] dm-mpath: Complain about unsupported __multipath_map_bio() return values
  2017-08-09 18:32 ` [PATCH 5/7] dm-mpath: Complain about unsupported __multipath_map_bio() return values Bart Van Assche
@ 2017-08-10  9:29   ` Christoph Hellwig
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:29 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: dm-devel, Laurence Oberman, Christoph Hellwig, Mike Snitzer

On Wed, Aug 09, 2017 at 11:32:14AM -0700, Bart Van Assche wrote:
> The only behavior change introduced by this patch is that WARN_ON_ONCE()
> is called if __multipath_map_bio() would return an unsupported return
> value.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/7] dm-mpath: Retry BLK_STS_RESOURCE errors
  2017-08-09 18:32 ` [PATCH 6/7] dm-mpath: Retry BLK_STS_RESOURCE errors Bart Van Assche
@ 2017-08-10  9:30   ` Christoph Hellwig
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:30 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: dm-devel, Laurence Oberman, Christoph Hellwig, Mike Snitzer

On Wed, Aug 09, 2017 at 11:32:15AM -0700, Bart Van Assche wrote:
> Retry requests instead of failing these if an out-of-memory error
> occurs or the block driver below dm-mpath is busy. This restores
> the v4.12 behavior of noretry_error(), namely that -ENOMEM results
> in a retry.

Thanks,

not idea how the resource one leaked in.

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 7/7] dm-mpath: Make the dm-sq requeuing behavior consistent with the dm-mq behavior
  2017-08-09 18:32 ` [PATCH 7/7] dm-mpath: Make the dm-sq requeuing behavior consistent with the dm-mq behavior Bart Van Assche
@ 2017-08-10  9:30   ` Christoph Hellwig
  0 siblings, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2017-08-10  9:30 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: dm-devel, Laurence Oberman, Christoph Hellwig, Mike Snitzer

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio()
  2017-08-09 18:32 ` [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio() Bart Van Assche
  2017-08-10  9:28   ` Christoph Hellwig
@ 2017-08-10 14:28   ` Laurence Oberman
  2017-08-10 15:02     ` Bart Van Assche
  1 sibling, 1 reply; 18+ messages in thread
From: Laurence Oberman @ 2017-08-10 14:28 UTC (permalink / raw)
  To: Bart Van Assche, Mike Snitzer; +Cc: dm-devel, Christoph Hellwig

On Wed, 2017-08-09 at 11:32 -0700, Bart Van Assche wrote:
> Detected by sparse.
> 
> Fixes: commit 4e4cbee93d56 ("block: switch bios to blk_status_t")
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Laurence Oberman <loberman@redhat.com>
> ---
>  drivers/md/dm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 2edbcc2d7d3f..8298670757e9 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1523,7 +1523,7 @@ static void __split_and_process_bio(struct
> mapped_device *md,
>  	}
>  
>  	/* drop the extra reference count */
> -	dec_pending(ci.io, error);
> +	dec_pending(ci.io, errno_to_blk_status(error));
>  }
>  /*-----------------------------------------------------------------
>   * CRUD END

Hello Bart

I applied and tested all patches except patch 2/7 (as discussed)
I also still have the prior patch in place you sent me.

I built a new kernel on 4.13-rc3

The soft lockups are gone for me with 10 parallel writes.
I also have not seen a hard lockup so far.

What is important to note is that if I use the [none] scheduler I am
still stable but I will sporadically get the      ******
"[38071.773471] device-mapper: multipath: blk_get_request() returned
-11 - requeuing
"

However that is not unexpected I guess.

The ( requeuing messages) do NOT show up with [mq-deadline] which is
all I was testing with last night and yesterday. So when I said they
are gone, that was specifically for [mq-deadline]

Patches look good and are helping. I will circle back and patch and
test on Ming's kernel.

For the series except PATCH2/7 and including

[PATCH] block: Make blk_mq_delay_kick_requeue_list() rerun the queue at
a quiet time


Tested-by: Laurence Oberman <loberman@redhat.com>

Thanks
Laurence




--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio()
  2017-08-10 14:28   ` Laurence Oberman
@ 2017-08-10 15:02     ` Bart Van Assche
  0 siblings, 0 replies; 18+ messages in thread
From: Bart Van Assche @ 2017-08-10 15:02 UTC (permalink / raw)
  To: loberman@redhat.com, snitzer@redhat.com; +Cc: dm-devel@redhat.com, hch@lst.de

On Thu, 2017-08-10 at 10:28 -0400, Laurence Oberman wrote:
> The soft lockups are gone for me with 10 parallel writes.
> I also have not seen a hard lockup so far.

That's great news!

> What is important to note is that if I use the [none] scheduler I am
> still stable but I will sporadically get the      ******
> "[38071.773471] device-mapper: multipath: blk_get_request() returned
> -11 - requeuing"
> 
> However that is not unexpected I guess.

Yes, this is expected.

> For the series except PATCH2/7 and including
> 
> [PATCH] block: Make blk_mq_delay_kick_requeue_list() rerun the queue at
> a quiet time
> 
> Tested-by: Laurence Oberman <loberman@redhat.com>

Thanks for the testing!

Bart.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-08-10 15:02 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-09 18:32 [PATCH 0/7] Device mapper and dm-mpath fixes Bart Van Assche
2017-08-09 18:32 ` [PATCH 1/7] dm: Fix the second dec_pending() argument in __split_and_process_bio() Bart Van Assche
2017-08-10  9:28   ` Christoph Hellwig
2017-08-10 14:28   ` Laurence Oberman
2017-08-10 15:02     ` Bart Van Assche
2017-08-09 18:32 ` [PATCH 2/7] dm: Fix printk() rate limiting code Bart Van Assche
2017-08-09 18:32 ` [PATCH 3/7] dm-mpath: Do not lock up a CPU with requeuing activity Bart Van Assche
2017-08-09 18:32   ` Bart Van Assche
2017-08-10  9:29   ` Christoph Hellwig
2017-08-09 18:32 ` [PATCH 4/7] dm-mpath: Avoid that building with W=1 causes gcc 7 to complain about fall-through Bart Van Assche
2017-08-10  9:29   ` Christoph Hellwig
2017-08-09 18:32 ` [PATCH 5/7] dm-mpath: Complain about unsupported __multipath_map_bio() return values Bart Van Assche
2017-08-10  9:29   ` Christoph Hellwig
2017-08-09 18:32 ` [PATCH 6/7] dm-mpath: Retry BLK_STS_RESOURCE errors Bart Van Assche
2017-08-10  9:30   ` Christoph Hellwig
2017-08-09 18:32 ` [PATCH 7/7] dm-mpath: Make the dm-sq requeuing behavior consistent with the dm-mq behavior Bart Van Assche
2017-08-10  9:30   ` Christoph Hellwig
2017-08-09 21:49 ` [PATCH 0/7] Device mapper and dm-mpath fixes Mike Snitzer

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.