All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET v4] blk-mq: reimplement timeout handling
@ 2018-01-08 19:15 Tejun Heo
  2018-01-08 19:15 ` [PATCH 1/8] blk-mq: move hctx lock/unlock into a helper Tejun Heo
                   ` (7 more replies)
  0 siblings, 8 replies; 41+ messages in thread
From: Tejun Heo @ 2018-01-08 19:15 UTC (permalink / raw)
  To: jack, axboe, clm, jbacik
  Cc: kernel-team, linux-kernel, linux-btrfs, peterz, jianchao.w.wang,
	Bart.VanAssche, linux-block

Hello,

Changes from [v3]

- Rebased on top of for-4.16/block.

- Integrated Jens's hctx_[un]lock() factoring patch and refreshed the
  patches accordingly.

- Added comment explaining the use of hctx_lock() instead of
  rcu_read_lock() in completion path.

Changes from [v2]

- Possible extended looping around seqcount and u64_stat_sync fixed.

- Misplaced MQ_RQ_IDLE state setting fixed.

- RQF_MQ_TIMEOUT_EXPIRED added to prevent firing the same timeout
  multiple times.

- s/queue_rq_src/srcu/ patch added.

- Other misc changes.

Changes from [v1]

- BLK_EH_RESET_TIMER handling fixed.

- s/request->gstate_seqc/request->gstate_seq/

- READ_ONCE() added to blk_mq_rq_udpate_state().

- Removed left over blk_clear_rq_complete() invocation from
  blk_mq_rq_timed_out().

Currently, blk-mq timeout path synchronizes against the usual
issue/completion path using a complex scheme involving atomic
bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
rules.  Unfortunatley, it contains quite a few holes.

It's pretty easy to make blk_mq_check_expired() terminate a later
instance of a request.  If we induce 5 sec delay before
time_after_eq() test in blk_mq_check_expired(), shorten the timeout to
2s, and issue back-to-back large IOs, blk-mq starts timing out
requests spuriously pretty quickly.  Nothing actually timed out.  It
just made the call on a recycle instance of a request and then
terminated a later instance long after the original instance finished.
The scenario isn't theoretical either.

This patchset replaces the broken synchronization mechanism with a RCU
and generation number based one.  Please read the patch description of
the second path for more details.

This patchset contains the following eight patches.

0001-blk-mq-move-hctx-lock-unlock-into-a-helper.patch
0002-blk-mq-protect-completion-path-with-RCU.patch
0003-blk-mq-replace-timeout-synchronization-with-a-RCU-an.patch
0004-blk-mq-use-blk_mq_rq_state-instead-of-testing-REQ_AT.patch
0005-blk-mq-make-blk_abort_request-trigger-timeout-path.patch
0006-blk-mq-remove-REQ_ATOM_COMPLETE-usages-from-blk-mq.patch
0007-blk-mq-remove-REQ_ATOM_STARTED.patch
0008-blk-mq-rename-blk_mq_hw_ctx-queue_rq_srcu-to-srcu.patch

and is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git blk-mq-timeout-v4

diffstat follows.  Thanks.

 block/blk-core.c       |    2 
 block/blk-mq-debugfs.c |    4 
 block/blk-mq.c         |  340 ++++++++++++++++++++++++++++---------------------
 block/blk-mq.h         |   49 ++++++-
 block/blk-timeout.c    |   11 -
 block/blk.h            |    7 -
 include/linux/blk-mq.h |    3 
 include/linux/blkdev.h |   25 +++
 8 files changed, 283 insertions(+), 158 deletions(-)

--
tejun

[v3] http://lkml.kernel.org/r/20171216120726.517153-1-tj@kernel.org
[v2] http://lkml.kernel.org/r/20171010155441.753966-1-tj@kernel.org
[v1] http://lkml.kernel.org/r/20171209192525.982030-1-tj@kernel.org

^ permalink raw reply	[flat|nested] 41+ messages in thread
* [PATCHSET v5] blk-mq: reimplement timeout handling
@ 2018-01-09 16:29 Tejun Heo
  2018-01-09 16:29 ` [PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path Tejun Heo
  0 siblings, 1 reply; 41+ messages in thread
From: Tejun Heo @ 2018-01-09 16:29 UTC (permalink / raw)
  To: jack, axboe, clm, jbacik
  Cc: kernel-team, linux-kernel, linux-btrfs, peterz, jianchao.w.wang,
	Bart.VanAssche, linux-block

Hello,

Changes from [v4]

- Comments added.  Patch description updated.

Changes from [v3]

- Rebased on top of for-4.16/block.

- Integrated Jens's hctx_[un]lock() factoring patch and refreshed the
  patches accordingly.

- Added comment explaining the use of hctx_lock() instead of
  rcu_read_lock() in completion path.

Changes from [v2]

- Possible extended looping around seqcount and u64_stat_sync fixed.

- Misplaced MQ_RQ_IDLE state setting fixed.

- RQF_MQ_TIMEOUT_EXPIRED added to prevent firing the same timeout
  multiple times.

- s/queue_rq_src/srcu/ patch added.

- Other misc changes.

Changes from [v1]

- BLK_EH_RESET_TIMER handling fixed.

- s/request->gstate_seqc/request->gstate_seq/

- READ_ONCE() added to blk_mq_rq_udpate_state().

- Removed left over blk_clear_rq_complete() invocation from
  blk_mq_rq_timed_out().

Currently, blk-mq timeout path synchronizes against the usual
issue/completion path using a complex scheme involving atomic
bitflags, REQ_ATOM_*, memory barriers and subtle memory coherence
rules.  Unfortunatley, it contains quite a few holes.

It's pretty easy to make blk_mq_check_expired() terminate a later
instance of a request.  If we induce 5 sec delay before
time_after_eq() test in blk_mq_check_expired(), shorten the timeout to
2s, and issue back-to-back large IOs, blk-mq starts timing out
requests spuriously pretty quickly.  Nothing actually timed out.  It
just made the call on a recycle instance of a request and then
terminated a later instance long after the original instance finished.
The scenario isn't theoretical either.

This patchset replaces the broken synchronization mechanism with a RCU
and generation number based one.  Please read the patch description of
the second path for more details.

This patchset contains the following eight patches.

0001-blk-mq-move-hctx-lock-unlock-into-a-helper.patch
0002-blk-mq-protect-completion-path-with-RCU.patch
0003-blk-mq-replace-timeout-synchronization-with-a-RCU-an.patch
0004-blk-mq-use-blk_mq_rq_state-instead-of-testing-REQ_AT.patch
0005-blk-mq-make-blk_abort_request-trigger-timeout-path.patch
0006-blk-mq-remove-REQ_ATOM_COMPLETE-usages-from-blk-mq.patch
0007-blk-mq-remove-REQ_ATOM_STARTED.patch
0008-blk-mq-rename-blk_mq_hw_ctx-queue_rq_srcu-to-srcu.patch

and is available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git blk-mq-timeout-v5

diffstat follows.  Thanks.

 block/blk-core.c       |    2 
 block/blk-mq-debugfs.c |    4 
 block/blk-mq.c         |  346 +++++++++++++++++++++++++++++--------------------
 block/blk-mq.h         |   49 ++++++
 block/blk-timeout.c    |   16 +-
 block/blk.h            |    7 
 include/linux/blk-mq.h |    3 
 include/linux/blkdev.h |   25 +++
 8 files changed, 294 insertions(+), 158 deletions(-)

--
tejun

[v4] http://lkml.kernel.org/r/20180108191542.379478-1-tj@kernel.org
[v3] http://lkml.kernel.org/r/20171216120726.517153-1-tj@kernel.org
[v2] http://lkml.kernel.org/r/20171010155441.753966-1-tj@kernel.org
[v1] http://lkml.kernel.org/r/20171209192525.982030-1-tj@kernel.org

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2018-01-09 16:29 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-08 19:15 [PATCHSET v4] blk-mq: reimplement timeout handling Tejun Heo
2018-01-08 19:15 ` [PATCH 1/8] blk-mq: move hctx lock/unlock into a helper Tejun Heo
2018-01-08 19:24   ` Bart Van Assche
2018-01-08 19:24     ` Bart Van Assche
2018-01-08 19:15 ` [PATCH 2/8] blk-mq: protect completion path with RCU Tejun Heo
2018-01-08 19:57   ` Holger Hoffstätte
2018-01-08 20:15     ` Jens Axboe
2018-01-08 22:55       ` Jens Axboe
2018-01-08 23:27         ` Holger Hoffstätte
2018-01-08 23:33           ` Holger Hoffstätte
2018-01-09  7:08   ` Hannes Reinecke
2018-01-09 15:22     ` Jens Axboe
2018-01-09 16:12   ` Bart Van Assche
2018-01-09 16:12     ` Bart Van Assche
2018-01-09 16:17     ` Jens Axboe
2018-01-09 16:19     ` tj
2018-01-09 16:22       ` Jens Axboe
2018-01-08 19:15 ` [PATCH 3/8] blk-mq: replace timeout synchronization with a RCU and generation based scheme Tejun Heo
2018-01-08 21:06   ` Bart Van Assche
2018-01-08 21:06     ` Bart Van Assche
2018-01-09 15:46     ` tj
2018-01-08 23:29   ` Bart Van Assche
2018-01-08 23:29     ` Bart Van Assche
2018-01-09 15:46     ` tj
2018-01-08 19:15 ` [PATCH 4/8] blk-mq: use blk_mq_rq_state() instead of testing REQ_ATOM_COMPLETE Tejun Heo
2018-01-08 22:03   ` Bart Van Assche
2018-01-08 22:03     ` Bart Van Assche
2018-01-08 19:15 ` [PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path Tejun Heo
2018-01-08 22:10   ` Bart Van Assche
2018-01-08 22:10     ` Bart Van Assche
2018-01-09 16:02     ` tj
2018-01-08 19:15 ` [PATCH 6/8] blk-mq: remove REQ_ATOM_COMPLETE usages from blk-mq Tejun Heo
2018-01-08 23:36   ` Bart Van Assche
2018-01-08 23:36     ` Bart Van Assche
2018-01-08 19:15 ` [PATCH 7/8] blk-mq: remove REQ_ATOM_STARTED Tejun Heo
2018-01-08 23:47   ` Bart Van Assche
2018-01-08 23:47     ` Bart Van Assche
2018-01-08 19:15 ` [PATCH 8/8] blk-mq: rename blk_mq_hw_ctx->queue_rq_srcu to ->srcu Tejun Heo
2018-01-08 23:48   ` Bart Van Assche
2018-01-08 23:48     ` Bart Van Assche
  -- strict thread matches above, loose matches on Subject: below --
2018-01-09 16:29 [PATCHSET v5] blk-mq: reimplement timeout handling Tejun Heo
2018-01-09 16:29 ` [PATCH 5/8] blk-mq: make blk_abort_request() trigger timeout path Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.