qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 0/2] Correct access to wrong BlockBackendPublic structures
From: Alberto Garcia @ 2016-10-17 15:46 UTC
  To: qemu-devel
  Cc: qemu-block, qemu-stable, Kevin Wolf, Paolo Bonzini,
	Alberto Garcia

Hi all,

Paolo found that commit 27ccdd52598290f introduced a regression in the
throttling code.

It is easy to reproduce in scenarios where a throttling group contains
several drives but only one of them receives writes. In that case the
round-robin algorithm can keep selecting the wrong drive, so the actual
requests are never completed.
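
To illustrate, here is a standalone Python sketch of the token
selection (my own simplification, not QEMU code; the drive names and
request counts are made up):

pending = {"drive0": 0, "drive1": 3}    # only drive1 has queued I/O
ring = ["drive0", "drive1"]             # round-robin order

def next_token(caller, group_token, buggy=False):
    # Walk the ring starting after the group token, looking for a
    # backend with pending requests.  The buggy variant tests the
    # *caller's* queue instead of the candidate's, like the code does
    # after commit 27ccdd52598290f.
    start = group_token
    token = ring[(ring.index(start) + 1) % len(ring)]
    while token != start:
        has_reqs = pending[caller] if buggy else pending[token]
        if has_reqs:
            break
        token = ring[(ring.index(token) + 1) % len(ring)]
    if token == start and not (pending[caller] if buggy else pending[token]):
        token = caller
    return token

# drive0 schedules while drive1 has requests queued:
print(next_token("drive0", "drive0", buggy=True))   # drive0: drive1 starves
print(next_token("drive0", "drive0", buggy=False))  # drive1: correct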

QEMU 2.7 is affected; here is the patch to fix it, plus a test case.

Thanks,

Berto

Alberto Garcia (2):
  throttle: Correct access to wrong BlockBackendPublic structures
  qemu-iotests: Test I/O in a single drive from a throttling group

 block/throttle-groups.c    | 27 +++++++++++++++++++++++----
 tests/qemu-iotests/093     | 33 ++++++++++++++++++++++++++++-----
 tests/qemu-iotests/093.out |  4 ++--
 3 files changed, 53 insertions(+), 11 deletions(-)

-- 
2.9.3


* [Qemu-devel] [PATCH 1/2] throttle: Correct access to wrong BlockBackendPublic structures
From: Alberto Garcia @ 2016-10-17 15:46 UTC
  To: qemu-devel
  Cc: qemu-block, qemu-stable, Kevin Wolf, Paolo Bonzini,
	Alberto Garcia

In 27ccdd52598290f0f8b58be56e235aff7aebfaf3 the throttling fields were
moved from BlockDriverState to BlockBackend. However, in a few cases
the code started using throttling fields from the active BlockBackend
instead of the round-robin token, making the algorithm behave
incorrectly.

This can cause starvation if there's a throttling group with several
drives but only one of them has I/O.
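
To double-check the fix, the invariant captured by the assert added at
the end of next_throttle_token() below can be exercised with a
standalone Python sketch of the fixed selection (not QEMU code; the
ring and the names are made up):

import random

ring = ["drive0", "drive1", "drive2"]

def next_token_fixed(caller, group_token, pending):
    # Mirrors the fixed C logic: test the *candidate's* pending queue.
    start = group_token
    token = ring[(ring.index(start) + 1) % len(ring)]
    while token != start and not pending[token]:
        token = ring[(ring.index(token) + 1) % len(ring)]
    if token == start and not pending[token]:
        token = caller
    return token

for _ in range(1000):
    pending = {d: random.randint(0, 2) for d in ring}
    caller, group_token = random.choice(ring), random.choice(ring)
    token = next_token_fixed(caller, group_token, pending)
    # Either we return the caller, or a backend with pending requests.
    assert token == caller or pending[token]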

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/throttle-groups.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 59545e2..17b2efb 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -168,6 +168,22 @@ static BlockBackend *throttle_group_next_blk(BlockBackend *blk)
     return blk_by_public(next);
 }
 
+/*
+ * Return whether a BlockBackend has pending requests.
+ *
+ * This assumes that tg->lock is held.
+ *
+ * @blk: the BlockBackend
+ * @is_write:  the type of operation (read/write)
+ * @ret:       whether the BlockBackend has pending requests.
+ */
+static inline bool blk_has_pending_reqs(BlockBackend *blk,
+                                        bool is_write)
+{
+    const BlockBackendPublic *blkp = blk_get_public(blk);
+    return blkp->pending_reqs[is_write];
+}
+
 /* Return the next BlockBackend in the round-robin sequence with pending I/O
  * requests.
  *
@@ -188,7 +204,7 @@ static BlockBackend *next_throttle_token(BlockBackend *blk, bool is_write)
 
     /* get next bs round in round robin style */
     token = throttle_group_next_blk(token);
-    while (token != start && !blkp->pending_reqs[is_write]) {
+    while (token != start && !blk_has_pending_reqs(token, is_write)) {
         token = throttle_group_next_blk(token);
     }
 
@@ -196,10 +212,13 @@ static BlockBackend *next_throttle_token(BlockBackend *blk, bool is_write)
      * then decide the token is the current bs because chances are
      * the current bs get the current request queued.
      */
-    if (token == start && !blkp->pending_reqs[is_write]) {
+    if (token == start && !blk_has_pending_reqs(token, is_write)) {
         token = blk;
     }
 
+    /* Either we return the original BB, or one with pending requests */
+    assert(token == blk || blk_has_pending_reqs(token, is_write));
+
     return token;
 }
 
@@ -257,7 +276,7 @@ static void schedule_next_request(BlockBackend *blk, bool is_write)
 
     /* Check if there's any pending request to schedule next */
     token = next_throttle_token(blk, is_write);
-    if (!blkp->pending_reqs[is_write]) {
+    if (!blk_has_pending_reqs(token, is_write)) {
         return;
     }
 
@@ -271,7 +290,7 @@ static void schedule_next_request(BlockBackend *blk, bool is_write)
             qemu_co_queue_next(&blkp->throttled_reqs[is_write])) {
             token = blk;
         } else {
-            ThrottleTimers *tt = &blkp->throttle_timers;
+            ThrottleTimers *tt = &blk_get_public(token)->throttle_timers;
             int64_t now = qemu_clock_get_ns(tt->clock_type);
             timer_mod(tt->timers[is_write], now + 1);
             tg->any_timer_armed[is_write] = true;
-- 
2.9.3


* [Qemu-devel] [PATCH 2/2] qemu-iotests: Test I/O in a single drive from a throttling group
From: Alberto Garcia @ 2016-10-17 15:46 UTC
  To: qemu-devel
  Cc: qemu-block, qemu-stable, Kevin Wolf, Paolo Bonzini,
	Alberto Garcia

iotest 093 contains a test that creates a throttling group with
several drives and performs I/O in all of them. This patch adds a new
test that creates a similar setup but performs I/O in only one of the
drives at a time.

This is useful to verify that the round-robin algorithm behaves
properly in these scenarios; it is specifically written to catch the
regression introduced in 27ccdd52598290f0f8b58be56e.
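
The key idea is to scale the group-wide limit by the number of drives:
since only one drive has I/O, that single drive should be able to
consume the entire group budget. Roughly, in simplified form (the real
code is in the diff below; the group size here is hypothetical, and
the two commented calls stand for the test helpers):

base = {"bps": 4096, "iops": 10}        # per-drive base limits
max_drives = 3                          # hypothetical group size

for drive in range(max_drives):
    for key in base:
        limits = dict([(k, 0) for k in base])
        limits[key] = base[key] * max_drives   # whole-group budget
        #   configure_throttle(ndrives=max_drives, params=limits)
        #   do_test_throttle(ndrives=1, seconds=5, params=limits,
        #                    first_drive=drive)
        print("drive%d gets the whole %s budget: %d"
              % (drive, key, limits[key]))

Like the rest of 093, the new case runs with ./check 093 from
tests/qemu-iotests.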

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 tests/qemu-iotests/093     | 33 ++++++++++++++++++++++++++++-----
 tests/qemu-iotests/093.out |  4 ++--
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/093 b/tests/qemu-iotests/093
index ffcb271..2ed393a 100755
--- a/tests/qemu-iotests/093
+++ b/tests/qemu-iotests/093
@@ -53,7 +53,7 @@ class ThrottleTestCase(iotests.QMPTestCase):
             result = self.vm.qmp("block_set_io_throttle", conv_keys=False, **params)
             self.assert_qmp(result, 'return', {})
 
-    def do_test_throttle(self, ndrives, seconds, params):
+    def do_test_throttle(self, ndrives, seconds, params, first_drive = 0):
         def check_limit(limit, num):
             # IO throttling algorithm is discrete, allow 10% error so the test
             # is more robust
@@ -85,12 +85,14 @@ class ThrottleTestCase(iotests.QMPTestCase):
         # Send I/O requests to all drives
         for i in range(rd_nr):
             for drive in range(0, ndrives):
-                self.vm.hmp_qemu_io("drive%d" % drive, "aio_read %d %d" %
+                idx = first_drive + drive
+                self.vm.hmp_qemu_io("drive%d" % idx, "aio_read %d %d" %
                                     (i * rq_size, rq_size))
 
         for i in range(wr_nr):
             for drive in range(0, ndrives):
-                self.vm.hmp_qemu_io("drive%d" % drive, "aio_write %d %d" %
+                idx = first_drive + drive
+                self.vm.hmp_qemu_io("drive%d" % idx, "aio_write %d %d" %
                                     (i * rq_size, rq_size))
 
         # We'll store the I/O stats for each drive in these arrays
@@ -105,15 +107,17 @@ class ThrottleTestCase(iotests.QMPTestCase):
 
         # Read the stats before advancing the clock
         for i in range(0, ndrives):
+            idx = first_drive + i
             start_rd_bytes[i], start_rd_iops[i], start_wr_bytes[i], \
-                start_wr_iops[i] = self.blockstats('drive%d' % i)
+                start_wr_iops[i] = self.blockstats('drive%d' % idx)
 
         self.vm.qtest("clock_step %d" % ns)
 
         # Read the stats after advancing the clock
         for i in range(0, ndrives):
+            idx = first_drive + i
             end_rd_bytes[i], end_rd_iops[i], end_wr_bytes[i], \
-                end_wr_iops[i] = self.blockstats('drive%d' % i)
+                end_wr_iops[i] = self.blockstats('drive%d' % idx)
 
         # Check that the I/O is within the limits and evenly distributed
         for i in range(0, ndrives):
@@ -129,6 +133,7 @@ class ThrottleTestCase(iotests.QMPTestCase):
             self.assertTrue(check_limit(params['iops_rd'], rd_iops))
             self.assertTrue(check_limit(params['iops_wr'], wr_iops))
 
+    # Connect N drives to a VM and test I/O in all of them
     def test_all(self):
         params = {"bps": 4096,
                   "bps_rd": 4096,
@@ -146,6 +151,24 @@ class ThrottleTestCase(iotests.QMPTestCase):
                 self.configure_throttle(ndrives, limits)
                 self.do_test_throttle(ndrives, 5, limits)
 
+    # Connect N drives to a VM and test I/O in just one of them at a time
+    def test_one(self):
+        params = {"bps": 4096,
+                  "bps_rd": 4096,
+                  "bps_wr": 4096,
+                  "iops": 10,
+                  "iops_rd": 10,
+                  "iops_wr": 10,
+                 }
+        # Repeat the test for each one of the drives
+        for drive in range(0, self.max_drives):
+            # Pick each out of all possible params and test
+            for tk in params:
+                limits = dict([(k, 0) for k in params])
+                limits[tk] = params[tk] * self.max_drives
+                self.configure_throttle(self.max_drives, limits)
+                self.do_test_throttle(1, 5, limits, drive)
+
     def test_burst(self):
         params = {"bps": 4096,
                   "bps_rd": 4096,
diff --git a/tests/qemu-iotests/093.out b/tests/qemu-iotests/093.out
index 914e373..2f7d390 100644
--- a/tests/qemu-iotests/093.out
+++ b/tests/qemu-iotests/093.out
@@ -1,5 +1,5 @@
-.....
+.......
 ----------------------------------------------------------------------
-Ran 5 tests
+Ran 7 tests
 
 OK
-- 
2.9.3


* Re: [Qemu-devel] [PATCH 1/2] throttle: Correct access to wrong BlockBackendPublic structures
From: Paolo Bonzini @ 2016-10-17 15:56 UTC
  To: Alberto Garcia, qemu-devel; +Cc: qemu-block, qemu-stable, Kevin Wolf



On 17/10/2016 17:46, Alberto Garcia wrote:
> In 27ccdd52598290f0f8b58be56e235aff7aebfaf3 the throttling fields were
> moved from BlockDriverState to BlockBackend. However, in a few cases
> the code started using throttling fields from the active BlockBackend
> instead of the round-robin token, making the algorithm behave
> incorrectly.
> 
[...]
> 
> +    /* Either we return the original BB, or one with pending requests */
> +    assert(token == blk || blk_has_pending_reqs(token, is_write));

Nice. :-)

[...]

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>


* Re: [Qemu-devel] [PATCH 0/2] Correct access to wrong BlockBackendPublic structures
From: Kevin Wolf @ 2016-10-18 14:09 UTC
  To: Alberto Garcia; +Cc: qemu-devel, qemu-block, qemu-stable, Paolo Bonzini

Am 17.10.2016 um 17:46 hat Alberto Garcia geschrieben:
> Hi all,
> 
> Paolo found that commit 27ccdd52598290f introduced a regression in the
> throttling code.
> 
> It is easy to reproduce in scenarios where a throttling group contains
> several drives but only one of them receives writes. In that case the
> round-robin algorithm can keep selecting the wrong drive, so the actual
> requests are never completed.
> 
> QEMU 2.7 is affected; here is the patch to fix it, plus a test case.

Thanks, applied to the block branch.

Kevin

