qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves
@ 2018-07-04 14:54 Stefan Hajnoczi
  2018-07-11 12:32 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2018-07-04 14:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Kevin Wolf, Max Reitz, Alberto Garcia,
	Stefan Hajnoczi

Throttle groups consist of members sharing one throttling state
(including bps/iops limits).  Round-robin scheduling is used to ensure
fairness.  If a group member already has a timer pending then other
group members do not schedule their own timers.  The next group member
will have its turn when the existing timer expires.

A hang may occur when a group member leaves while it had a timer
scheduled.  Although the code carefully removes the group member from
the round-robin list, it does not schedule the next member.  Therefore
remaining members continue to wait for the removed member's timer to
expire.

This patch schedules the next request if a timer is pending.
Unfortunately the actual bug is a race condition that I've been unable
to capture in a test case.
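
The failure mode described above can be illustrated with a toy model.  This
is only a sketch under simplifying assumptions, not QEMU code: ToyGroup and
the toy_* helpers are hypothetical names, the model tracks a single I/O
direction, and "time" is not simulated at all.  The only thing it borrows
from the patch is the idea that a departing member with a pending timer
must clear the group's "timer armed" flag and hand scheduling over to the
next member.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_MEMBERS 4

/* Hypothetical stand-in for a throttle group (one I/O direction only). */
typedef struct {
    bool any_timer_armed;            /* some member is believed to own a timer */
    int  timer_owner;                /* index of that member, or -1 if none */
    bool present[TOY_MAX_MEMBERS];   /* members still registered in the group */
    int  queued[TOY_MAX_MEMBERS];    /* throttled requests waiting per member */
} ToyGroup;

void toy_group_init(ToyGroup *g, int nmembers)
{
    assert(nmembers >= 0 && nmembers <= TOY_MAX_MEMBERS);
    g->any_timer_armed = false;
    g->timer_owner = -1;
    for (int m = 0; m < TOY_MAX_MEMBERS; m++) {
        g->present[m] = m < nmembers;
        g->queued[m] = 0;
    }
}

/* Round-robin: arm a timer for the next member after 'from' that has
 * queued requests.  Does nothing while a timer is already armed. */
void toy_schedule_next(ToyGroup *g, int from)
{
    if (g->any_timer_armed) {
        return;
    }
    for (int k = 1; k <= TOY_MAX_MEMBERS; k++) {
        int m = (from + k) % TOY_MAX_MEMBERS;
        if (g->present[m] && g->queued[m] > 0) {
            g->any_timer_armed = true;
            g->timer_owner = m;
            return;
        }
    }
}

/* A member submits a request that has to wait for the throttle. */
void toy_enqueue(ToyGroup *g, int m)
{
    g->queued[m]++;
    toy_schedule_next(g, m);
}

/* Member m leaves the group.  Its timer goes away with it.  With
 * with_fix == true the flag is cleared and the next member is scheduled,
 * which is the behaviour the patch adds. */
void toy_unregister(ToyGroup *g, int m, bool with_fix)
{
    bool had_timer = g->any_timer_armed && g->timer_owner == m;

    g->present[m] = false;
    g->queued[m] = 0;
    if (had_timer) {
        g->timer_owner = -1;             /* this timer will never fire */
        if (with_fix) {
            g->any_timer_armed = false;
            toy_schedule_next(g, m);
        }
    }
}

/* Hung: the flag says a timer is armed, nobody owns one, and somebody
 * is still waiting for it. */
bool toy_is_hung(const ToyGroup *g)
{
    if (!g->any_timer_armed || g->timer_owner != -1) {
        return false;
    }
    for (int m = 0; m < TOY_MAX_MEMBERS; m++) {
        if (g->present[m] && g->queued[m] > 0) {
            return true;
        }
    }
    return false;
}
```

In this model, two members each queue a request, member 0 ends up owning
the timer, and member 0 then leaves.  Without the fix the group is left
with the flag set and no timer owner, so member 1 waits forever.  With the
fix, member 1 becomes the timer owner.  Note that the toy removes the member
before rescheduling, whereas the real patch calls schedule_next_request()
while the member is still on the list.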

Sometimes drive2 hangs when drive1 is removed from the throttling group:

  $ qemu ... -drive if=none,id=drive1,cache=none,format=qcow2,file=data1.qcow2,iops=100,group=foo \
             -device virtio-blk-pci,id=virtio-blk-pci0,drive=drive1 \
             -drive if=none,id=drive2,cache=none,format=qcow2,file=data2.qcow2,iops=10,group=foo \
             -device virtio-blk-pci,id=virtio-blk-pci1,drive=drive2
  (guest-console1)# fio -filename /dev/vda 4k-seq-read.job
  (guest-console2)# fio -filename /dev/vdb 4k-seq-read.job
  (qmp) {"execute": "block_set_io_throttle", "arguments": {"device": "drive1","bps": 0,"bps_rd": 0,"bps_wr": 0,"iops": 0,"iops_rd": 0,"iops_wr": 0}}

Reported-by: Nini Gu <ngu@redhat.com>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1535914
Cc: Alberto Garcia <berto@igalia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/throttle-groups.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 36cc0430c3..e297b04e17 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -564,6 +564,10 @@ void throttle_group_unregister_tgm(ThrottleGroupMember *tgm)
 
     qemu_mutex_lock(&tg->lock);
     for (i = 0; i < 2; i++) {
+        if (timer_pending(tgm->throttle_timers.timers[i])) {
+            tg->any_timer_armed[i] = false;
+            schedule_next_request(tgm, i);
+        }
         if (tg->tokens[i] == tgm) {
             token = throttle_group_next_tgm(tgm);
             /* Take care of the case where this is the last tgm in the group */
-- 
2.17.1


* Re: [Qemu-devel] [Qemu-block] [PATCH] throttle-groups: fix hang when group member leaves
  2018-07-04 14:54 [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves Stefan Hajnoczi
@ 2018-07-11 12:32 ` Stefan Hajnoczi
  2018-07-13 12:19 ` Kashyap Chamarthy
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2018-07-11 12:32 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Kevin Wolf, qemu-block, Max Reitz, Alberto Garcia


On Wed, Jul 04, 2018 at 03:54:10PM +0100, Stefan Hajnoczi wrote:

Sorry you weren't CCed originally, Berto.  This one is for you! :)

> Throttle groups consist of members sharing one throttling state
> (including bps/iops limits).  Round-robin scheduling is used to ensure
> fairness.  If a group member already has a timer pending then other
> group members do not schedule their own timers.  The next group member
> will have its turn when the existing timer expires.
> 
> A hang may occur when a group member leaves while it had a timer
> scheduled.  Although the code carefully removes the group member from
> the round-robin list, it does not schedule the next member.  Therefore
> remaining members continue to wait for the removed member's timer to
> expire.
> 
> This patch schedules the next request if a timer is pending.
> Unfortunately the actual bug is a race condition that I've been unable
> to capture in a test case.
> 
> Sometimes drive2 hangs when drive1 is removed from the throttling group:
> 
>   $ qemu ... -drive if=none,id=drive1,cache=none,format=qcow2,file=data1.qcow2,iops=100,group=foo \
>              -device virtio-blk-pci,id=virtio-blk-pci0,drive=drive1 \
>              -drive if=none,id=drive2,cache=none,format=qcow2,file=data2.qcow2,iops=10,group=foo \
>              -device virtio-blk-pci,id=virtio-blk-pci1,drive=drive2
>   (guest-console1)# fio -filename /dev/vda 4k-seq-read.job
>   (guest-console2)# fio -filename /dev/vdb 4k-seq-read.job
>   (qmp) {"execute": "block_set_io_throttle", "arguments": {"device": "drive1","bps": 0,"bps_rd": 0,"bps_wr": 0,"iops": 0,"iops_rd": 0,"iops_wr": 0}}
> 
> Reported-by: Nini Gu <ngu@redhat.com>
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1535914
> Cc: Alberto Garcia <berto@igalia.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/throttle-groups.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/block/throttle-groups.c b/block/throttle-groups.c
> index 36cc0430c3..e297b04e17 100644
> --- a/block/throttle-groups.c
> +++ b/block/throttle-groups.c
> @@ -564,6 +564,10 @@ void throttle_group_unregister_tgm(ThrottleGroupMember *tgm)
>  
>      qemu_mutex_lock(&tg->lock);
>      for (i = 0; i < 2; i++) {
> +        if (timer_pending(tgm->throttle_timers.timers[i])) {
> +            tg->any_timer_armed[i] = false;
> +            schedule_next_request(tgm, i);
> +        }
>          if (tg->tokens[i] == tgm) {
>              token = throttle_group_next_tgm(tgm);
>              /* Take care of the case where this is the last tgm in the group */
> -- 
> 2.17.1
> 
> 



* Re: [Qemu-devel] [Qemu-block] [PATCH] throttle-groups: fix hang when group member leaves
  2018-07-04 14:54 [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves Stefan Hajnoczi
  2018-07-11 12:32 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
@ 2018-07-13 12:19 ` Kashyap Chamarthy
  2018-07-17  8:11 ` Stefan Hajnoczi
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Kashyap Chamarthy @ 2018-07-13 12:19 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Kevin Wolf, qemu-block, Max Reitz

On Wed, Jul 04, 2018 at 03:54:10PM +0100, Stefan Hajnoczi wrote:
> Throttle groups consist of members sharing one throttling state
> (including bps/iops limits).  Round-robin scheduling is used to ensure
> fairness.  If a group member already has a timer pending then other
> group members do not schedule their own timers.  The next group member
> will have its turn when the existing timer expires.
> 
> A hang may occur when a group member leaves while it had a timer
> scheduled.  Although the code carefully removes the group member from
> the round-robin list, it does not schedule the next member.  Therefore
> remaining members continue to wait for the removed member's timer to
> expire.
> 
> This patch schedules the next request if a timer is pending.
> Unfortunately the actual bug is a race condition that I've been unable
> to capture in a test case.
> 
> Sometimes drive2 hangs when drive1 is removed from the throttling group:
> 
>   $ qemu ... -drive if=none,id=drive1,cache=none,format=qcow2,file=data1.qcow2,iops=100,group=foo \
>              -device virtio-blk-pci,id=virtio-blk-pci0,drive=drive1 \
>              -drive if=none,id=drive2,cache=none,format=qcow2,file=data2.qcow2,iops=10,group=foo \
>              -device virtio-blk-pci,id=virtio-blk-pci1,drive=drive2
>   (guest-console1)# fio -filename /dev/vda 4k-seq-read.job
>   (guest-console2)# fio -filename /dev/vdb 4k-seq-read.job
>   (qmp) {"execute": "block_set_io_throttle", "arguments": {"device": "drive1","bps": 0,"bps_rd": 0,"bps_wr": 0,"iops": 0,"iops_rd": 0,"iops_wr": 0}}

Hi Stefan,

I realize you want to preserve the long lines to not break the JSON QMP
command.  But, FWIW, you might want to format it using one of the
convenient websites: https://jsonformatter.org/

So your QMP command nicely wraps (for the 'cost' of 11 extra lines):

    {
      "execute": "block_set_io_throttle",
      "arguments": {
        "device": "drive1",
        "bps": 0,
        "bps_rd": 0,
        "bps_wr": 0,
        "iops": 0,
        "iops_rd": 0,
        "iops_wr": 0
      }
    }


[...]

-- 
/kashyap


* Re: [Qemu-devel] [Qemu-block] [PATCH] throttle-groups: fix hang when group member leaves
  2018-07-04 14:54 [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves Stefan Hajnoczi
  2018-07-11 12:32 ` [Qemu-devel] [Qemu-block] " Stefan Hajnoczi
  2018-07-13 12:19 ` Kashyap Chamarthy
@ 2018-07-17  8:11 ` Stefan Hajnoczi
  2018-07-31 13:47 ` [Qemu-devel] " Alberto Garcia
  2018-07-31 16:47 ` Alberto Garcia
  4 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2018-07-17  8:11 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Kevin Wolf, qemu-block, Max Reitz


On Wed, Jul 04, 2018 at 03:54:10PM +0100, Stefan Hajnoczi wrote:
> Throttle groups consist of members sharing one throttling state
> (including bps/iops limits).  Round-robin scheduling is used to ensure
> fairness.  If a group member already has a timer pending then other
> group members do not schedule their own timers.  The next group member
> will have its turn when the existing timer expires.
> 
> A hang may occur when a group member leaves while it had a timer
> scheduled.  Although the code carefully removes the group member from
> the round-robin list, it does not schedule the next member.  Therefore
> remaining members continue to wait for the removed member's timer to
> expire.
> 
> This patch schedules the next request if a timer is pending.
> Unfortunately the actual bug is a race condition that I've been unable
> to capture in a test case.
> 
> Sometimes drive2 hangs when drive1 is removed from the throttling group:
> 
>   $ qemu ... -drive if=none,id=drive1,cache=none,format=qcow2,file=data1.qcow2,iops=100,group=foo \
>              -device virtio-blk-pci,id=virtio-blk-pci0,drive=drive1 \
>              -drive if=none,id=drive2,cache=none,format=qcow2,file=data2.qcow2,iops=10,group=foo \
>              -device virtio-blk-pci,id=virtio-blk-pci1,drive=drive2
>   (guest-console1)# fio -filename /dev/vda 4k-seq-read.job
>   (guest-console2)# fio -filename /dev/vdb 4k-seq-read.job
>   (qmp) {"execute": "block_set_io_throttle", "arguments": {"device": "drive1","bps": 0,"bps_rd": 0,"bps_wr": 0,"iops": 0,"iops_rd": 0,"iops_wr": 0}}
> 
> Reported-by: Nini Gu <ngu@redhat.com>
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1535914
> Cc: Alberto Garcia <berto@igalia.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/throttle-groups.c | 4 ++++
>  1 file changed, 4 insertions(+)

Berto is away in July.  I am merging this fix for QEMU 3.0.  If there
are any comments when Berto is back I'll send a follow-up patch.

Applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan



* Re: [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves
  2018-07-04 14:54 [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2018-07-17  8:11 ` Stefan Hajnoczi
@ 2018-07-31 13:47 ` Alberto Garcia
  2018-07-31 16:47 ` Alberto Garcia
  4 siblings, 0 replies; 7+ messages in thread
From: Alberto Garcia @ 2018-07-31 13:47 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz

On Wed 04 Jul 2018 04:54:10 PM CEST, Stefan Hajnoczi wrote:
> Throttle groups consist of members sharing one throttling state
> (including bps/iops limits).  Round-robin scheduling is used to ensure
> fairness.  If a group member already has a timer pending then other
> group members do not schedule their own timers.  The next group
> member will have its turn when the existing timer expires.
>
> A hang may occur when a group member leaves while it had a timer
> scheduled.

I haven't been able to reproduce this. When a member is removed from the
group the pending request queue must already be empty, so does this mean
that there's still a timer when the queue is already empty?

Berto


* Re: [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves
  2018-07-04 14:54 [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2018-07-31 13:47 ` [Qemu-devel] " Alberto Garcia
@ 2018-07-31 16:47 ` Alberto Garcia
  2018-08-01 14:45   ` Alberto Garcia
  4 siblings, 1 reply; 7+ messages in thread
From: Alberto Garcia @ 2018-07-31 16:47 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: qemu-block, Kevin Wolf, Max Reitz

On Wed 04 Jul 2018 04:54:10 PM CEST, Stefan Hajnoczi wrote:
> Throttle groups consist of members sharing one throttling state
> (including bps/iops limits).  Round-robin scheduling is used to ensure
> fairness.  If a group member already has a timer pending then other
> group members do not schedule their own timers.  The next group
> member will have its turn when the existing timer expires.
>
> A hang may occur when a group member leaves while it had a timer
> scheduled.

Ok, I can reproduce this if I run fio with iodepth=1.

We're draining the BDS before removing it from a throttle group, and
therefore there cannot be any pending requests.

So the problem seems to be that when throttle_co_drain_begin() runs the
pending requests from a member using throttle_group_co_restart_queue(),
it simply uses qemu_co_queue_next() and doesn't touch the timer at all.

So it can happen that there's a request in the queue waiting for a
timer, and after that call the request is gone but the timer remains.

The current patch is perhaps not worth touching at this point (we're
about to release QEMU 3.0), but I think that a better solution would be
to either

a) cancel the existing timer and reset tg->any_timer_armed on the given
   tgm after throttle_group_co_restart_queue() and before
   schedule_next_request() if the queue is empty.

b) force the existing timer to run immediately instead of calling
   throttle_group_co_restart_queue(). Seems cleaner, but I haven't tried
   this one yet.

I'll explore them a bit and send a patch.
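
Option (a) above can be sketched roughly as follows.  This is a minimal
model under stated assumptions, not the eventual QEMU patch: SketchGroup,
SketchMember and the sketch_* functions are hypothetical names,
sketch_restart_queue() only imitates the described behaviour of
throttle_group_co_restart_queue() (it wakes one queued request and leaves
the timer alone), and a single I/O direction is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the group-wide flag for one I/O direction. */
typedef struct {
    bool any_timer_armed;
} SketchGroup;

/* Hypothetical model of one group member. */
typedef struct {
    SketchGroup *group;
    bool timer_pending;   /* this member's timer is still scheduled */
    int  queued;          /* requests parked in this member's queue */
} SketchMember;

/* Imitates restarting one queued request without touching the timer.
 * Returns true if a request was restarted. */
bool sketch_restart_queue(SketchMember *m)
{
    if (m->queued == 0) {
        return false;
    }
    m->queued--;
    return true;
}

/* Option (a): once the queue is empty, cancel the leftover timer and
 * clear the group flag so another member is free to arm its own. */
void sketch_cancel_stale_timer(SketchMember *m)
{
    assert(m != NULL && m->group != NULL);
    if (m->queued == 0 && m->timer_pending) {
        m->timer_pending = false;
        m->group->any_timer_armed = false;
    }
}

/* Drain all of the member's queued requests, optionally applying
 * option (a) after each restart. */
void sketch_drain_member(SketchMember *m, bool option_a)
{
    while (sketch_restart_queue(m)) {
        if (option_a) {
            sketch_cancel_stale_timer(m);
        }
    }
}

/* A timer is stale when it is pending but nothing is waiting for it. */
bool sketch_timer_is_stale(const SketchMember *m)
{
    return m->timer_pending && m->queued == 0;
}
```

Draining a member that has one queued request and a pending timer leaves a
stale timer behind when option_a is false, matching the situation described
above.  With option_a set, both the timer and the group flag are cleared as
soon as the queue is empty.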

Berto


* Re: [Qemu-devel] [PATCH] throttle-groups: fix hang when group member leaves
  2018-07-31 16:47 ` Alberto Garcia
@ 2018-08-01 14:45   ` Alberto Garcia
  0 siblings, 0 replies; 7+ messages in thread
From: Alberto Garcia @ 2018-08-01 14:45 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel; +Cc: Kevin Wolf, qemu-block, Max Reitz

On Tue 31 Jul 2018 06:47:53 PM CEST, Alberto Garcia wrote:
> On Wed 04 Jul 2018 04:54:10 PM CEST, Stefan Hajnoczi wrote:
>> Throttle groups consist of members sharing one throttling state
>> (including bps/iops limits).  Round-robin scheduling is used to ensure
>> fairness.  If a group member already has a timer pending then other
>> group members do not schedule their own timers.  The next group
>> member will have its turn when the existing timer expires.
>>
>> A hang may occur when a group member leaves while it had a timer
>> scheduled.
>
> Ok, I can reproduce this if I run fio with iodepth=1.

I managed to write a test case for this, but unfortunately it seems that
this patch is not enough and it's still possible to hang QEMU 3.0.0-rc2.

I expect to have a fix by tomorrow.

Berto

