* Re: [PATCH]blk-mq-sched: remove the empty entry in token wait list
2017-08-29 14:52 [PATCH]blk-mq-sched: remove the empty entry in token wait list 查斌
@ 2017-09-07 18:40 ` Omar Sandoval
0 siblings, 0 replies; 2+ messages in thread
From: Omar Sandoval @ 2017-09-07 18:40 UTC (permalink / raw)
To: 查斌
Cc: linux-block, osandov, axboe, boyu.mt, wenqing.lz, qijiang.qj
On Tue, Aug 29, 2017 at 10:52:13PM +0800, 查斌 wrote:
> From: Bin Zha <zhabin.zb@alibaba-inc.com>
>
>
> When kyber adjusts the sync and other write depths to the
> minimum (1), there is a case that can cause requests to
> be stalled in the kyber_hctx_data list.
>
> The following is an example I have tested:
>
>
> CPU7                              CPU23
> block_rq_insert
> add_wait_queue
> __sbitmap_queue_get               block_rq_insert
> block_rq_issue                    waiting token free
> block_rq_complete  ------>        block_rq_issue
>                                   block_rq_complete
>       /|\                              |
>        |                               |
>        |                              \|/
>        |                          CPU29
>        |                          block_rq_insert
>        |                          waiting token free
>        |                          block_rq_issue
>        |------------------------- block_rq_complete
>
> CPU1
> block_rq_insert
> waiting token free
>
> The I/O request completed on CPU29 will wake up CPU7,
> because CPU7 was added to the waitqueue in
> kyber_get_domain_token. But CPU7's waiter is empty, so the
> wakeup won't reach CPU1. If no other requests are issued
> to push it forward, the requests stall in the
> kyber_hctx_data list.
>
>
> Signed-off-by: Bin Zha <zhabin.zb@alibaba-inc.com>
>
> diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
> index b9faabc..584bfd5 100644
> --- a/block/kyber-iosched.c
> +++ b/block/kyber-iosched.c
> @@ -548,6 +548,10 @@ static int kyber_get_domain_token(struct kyber_queue_data *kqd,
> * queue.
> */
> nr = __sbitmap_queue_get(domain_tokens);
> + if (nr >= 0) {
> + remove_wait_queue(&ws->wait, wait);
> + INIT_LIST_HEAD(&wait->task_list);
> + }
> }
> return nr;
> }
Hm... I could've sworn I thought about this case when I wrote this code,
but your analysis looks correct. I do remember that I didn't do this
because I was worried about a race like so:
add_wait_queue()
                                kyber_domain_wake()
                                  \_ list_del_init()
remove_wait_queue()
  \_ list_del()
But thinking about it some more, list_del() is probably safe after
list_del_init()?
Have you been able to reproduce this? I have the following patch to
force the domain token pools to one token, I imagine with the right
workload we can hit it:
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index f58cab8..b4e1bd7 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -58,9 +58,9 @@ enum {
* So, we cap these to a reasonable value.
*/
static const unsigned int kyber_depth[] = {
- [KYBER_READ] = 256,
- [KYBER_SYNC_WRITE] = 128,
- [KYBER_OTHER] = 64,
+ [KYBER_READ] = 1,
+ [KYBER_SYNC_WRITE] = 1,
+ [KYBER_OTHER] = 1,
};
/*
@@ -126,6 +126,7 @@ enum {
#define IS_GOOD(status) ((status) > 0)
#define IS_BAD(status) ((status) < 0)
+#if 0
static int kyber_lat_status(struct blk_stat_callback *cb,
unsigned int sched_domain, u64 target)
{
@@ -243,6 +244,7 @@ static void kyber_adjust_other_depth(struct kyber_queue_data *kqd,
if (depth != orig_depth)
sbitmap_queue_resize(&kqd->domain_tokens[KYBER_OTHER], depth);
}
+#endif
/*
* Apply heuristics for limiting queue depths based on gathered latency
@@ -250,6 +252,8 @@ static void kyber_adjust_other_depth(struct kyber_queue_data *kqd,
*/
static void kyber_stat_timer_fn(struct blk_stat_callback *cb)
{
+ return;
+#if 0
struct kyber_queue_data *kqd = cb->data;
int read_status, write_status;
@@ -269,6 +273,7 @@ static void kyber_stat_timer_fn(struct blk_stat_callback *cb)
((IS_BAD(read_status) || IS_BAD(write_status) ||
kqd->domain_tokens[KYBER_OTHER].sb.depth < kyber_depth[KYBER_OTHER])))
blk_stat_activate_msecs(kqd->cb, 100);
+#endif
}
static unsigned int kyber_sched_tags_shift(struct kyber_queue_data *kqd)
^ permalink raw reply related [flat|nested] 2+ messages in thread