* [Qemu-devel] [PATCH v2 0/2] block: Fix multiwrite error handling
@ 2010-07-02 12:07 Kevin Wolf
2010-07-02 12:07 ` [Qemu-devel] [PATCH v2 1/2] block: Fix early failure in multiwrite Kevin Wolf
2010-07-02 12:07 ` [Qemu-devel] [PATCH v2 2/2] block: Handle multiwrite errors only when all requests have completed Kevin Wolf
0 siblings, 2 replies; 5+ messages in thread
From: Kevin Wolf @ 2010-07-02 12:07 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
The bdrv_aio_multiwrite error handling has some bugs that lead to premature
cleanup, causing use-after-free and double free problems.
v2:
- Completely replaced patch 1 which Stefan found to be incorrect (thanks for
the good review!). Hope I've got it right this time.
Kevin Wolf (2):
block: Fix early failure in multiwrite
block: Handle multiwrite errors only when all requests have completed
block.c | 40 ++++++++++++++++++++++++++++++----------
1 files changed, 30 insertions(+), 10 deletions(-)
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH v2 1/2] block: Fix early failure in multiwrite
2010-07-02 12:07 [Qemu-devel] [PATCH v2 0/2] block: Fix multiwrite error handling Kevin Wolf
@ 2010-07-02 12:07 ` Kevin Wolf
2010-07-02 13:18 ` [Qemu-devel] " Stefan Hajnoczi
2010-07-02 12:07 ` [Qemu-devel] [PATCH v2 2/2] block: Handle multiwrite errors only when all requests have completed Kevin Wolf
1 sibling, 1 reply; 5+ messages in thread
From: Kevin Wolf @ 2010-07-02 12:07 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
bdrv_aio_writev may call the callback immediately (and it will commonly do so
in error cases). Current code doesn't consider this. For details see the
comment added by this patch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block.c | 35 +++++++++++++++++++++++++++++------
1 files changed, 29 insertions(+), 6 deletions(-)
diff --git a/block.c b/block.c
index 9176dec..e65971c 100644
--- a/block.c
+++ b/block.c
@@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
// Check for mergable requests
num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb);
- // Run the aio requests
+ /*
+ * Run the aio requests. As soon as one request can't be submitted
+ * successfully, fail all requests that are not yet submitted (we must
+ * return failure for all requests anyway)
+ *
+ * num_requests cannot be set to the right value immediately: If
+ * bdrv_aio_writev fails for some request, num_requests would be too high
+ * and therefore multiwrite_cb() would never recognize the multiwrite
+ * request as completed. We also cannot use the loop variable i to set it
+ * when the first request fails because the callback may already have been
+ * called for previously submitted requests. Thus, num_requests must be
+ * incremented for each request that is submitted.
+ *
+ * The problem that callbacks may be called early also means that we need
+ * to take care that num_requests doesn't become 0 before all requests are
+ * submitted - multiwrite_cb() would consider the multiwrite request
+ * completed. A dummy request that is "completed" by a manual call to
+ * multiwrite_cb() takes care of this.
+ */
+ mcb->num_requests = 1;
+
for (i = 0; i < num_reqs; i++) {
+ mcb->num_requests++;
acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov,
reqs[i].nb_sectors, multiwrite_cb, mcb);
@@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
// We can only fail the whole thing if no request has been
// submitted yet. Otherwise we'll wait for the submitted AIOs to
// complete and report the error in the callback.
- if (mcb->num_requests == 0) {
- reqs[i].error = -EIO;
+ if (i == 0) {
goto fail;
} else {
- mcb->num_requests++;
multiwrite_cb(mcb, -EIO);
break;
}
- } else {
- mcb->num_requests++;
}
}
+ /* Complete the dummy request */
+ multiwrite_cb(mcb, 0);
+
return 0;
fail:
+ for (i = 0; i < mcb->num_callbacks; i++) {
+ reqs[i].error = -EIO;
+ }
qemu_free(mcb);
return -1;
}
--
1.6.6.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] [PATCH v2 2/2] block: Handle multiwrite errors only when all requests have completed
2010-07-02 12:07 [Qemu-devel] [PATCH v2 0/2] block: Fix multiwrite error handling Kevin Wolf
2010-07-02 12:07 ` [Qemu-devel] [PATCH v2 1/2] block: Fix early failure in multiwrite Kevin Wolf
@ 2010-07-02 12:07 ` Kevin Wolf
1 sibling, 0 replies; 5+ messages in thread
From: Kevin Wolf @ 2010-07-02 12:07 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, stefanha
Don't try to be clever by freeing all temporary data and calling all callbacks
when the return value (an error) is certain. Doing so has at least two
important problems:
* The temporary data that is freed (qiov, possibly zero buffer) is still used
by the requests that have not yet completed.
* Calling the callbacks for all requests in the multiwrite means for the caller
that it may free buffers etc. which are still in use.
Just remember the error value and do the cleanup when all requests have
completed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block.c | 5 +----
1 files changed, 1 insertions(+), 4 deletions(-)
diff --git a/block.c b/block.c
index e65971c..dd6dd76 100644
--- a/block.c
+++ b/block.c
@@ -2042,14 +2042,11 @@ static void multiwrite_cb(void *opaque, int ret)
if (ret < 0 && !mcb->error) {
mcb->error = ret;
- multiwrite_user_cb(mcb);
}
mcb->num_requests--;
if (mcb->num_requests == 0) {
- if (mcb->error == 0) {
- multiwrite_user_cb(mcb);
- }
+ multiwrite_user_cb(mcb);
qemu_free(mcb);
}
}
--
1.6.6.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [Qemu-devel] Re: [PATCH v2 1/2] block: Fix early failure in multiwrite
2010-07-02 12:07 ` [Qemu-devel] [PATCH v2 1/2] block: Fix early failure in multiwrite Kevin Wolf
@ 2010-07-02 13:18 ` Stefan Hajnoczi
2010-07-02 13:32 ` Kevin Wolf
0 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2010-07-02 13:18 UTC (permalink / raw)
To: Kevin Wolf; +Cc: qemu-devel
On Fri, Jul 2, 2010 at 1:07 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> bdrv_aio_writev may call the callback immediately (and it will commonly do so
> in error cases). Current code doesn't consider this. For details see the
> comment added by this patch.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block.c | 35 +++++++++++++++++++++++++++++------
> 1 files changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/block.c b/block.c
> index 9176dec..e65971c 100644
> --- a/block.c
> +++ b/block.c
> @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
> // Check for mergable requests
> num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb);
>
> - // Run the aio requests
> + /*
> + * Run the aio requests. As soon as one request can't be submitted
> + * successfully, fail all requests that are not yet submitted (we must
> + * return failure for all requests anyway)
> + *
> + * num_requests cannot be set to the right value immediately: If
> + * bdrv_aio_writev fails for some request, num_requests would be too high
> + * and therefore multiwrite_cb() would never recognize the multiwrite
> + * request as completed. We also cannot use the loop variable i to set it
> + * when the first request fails because the callback may already have been
> + * called for previously submitted requests. Thus, num_requests must be
> + * incremented for each request that is submitted.
> + *
> + * The problem that callbacks may be called early also means that we need
> + * to take care that num_requests doesn't become 0 before all requests are
> + * submitted - multiwrite_cb() would consider the multiwrite request
> + * completed. A dummy request that is "completed" by a manual call to
> + * multiwrite_cb() takes care of this.
> + */
> + mcb->num_requests = 1;
> +
> for (i = 0; i < num_reqs; i++) {
> + mcb->num_requests++;
> acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov,
> reqs[i].nb_sectors, multiwrite_cb, mcb);
>
> @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
> // We can only fail the whole thing if no request has been
> // submitted yet. Otherwise we'll wait for the submitted AIOs to
> // complete and report the error in the callback.
> - if (mcb->num_requests == 0) {
> - reqs[i].error = -EIO;
> + if (i == 0) {
> goto fail;
> } else {
> - mcb->num_requests++;
> multiwrite_cb(mcb, -EIO);
When bdrv_aio_writev() fails we don't know if the callback has been
invoked by the block driver. Qcow2 will invoke the callback in some
cases. This is a problem because num_requests will be decremented
twice if we unconditionally call it here.
Stefan
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Qemu-devel] Re: [PATCH v2 1/2] block: Fix early failure in multiwrite
2010-07-02 13:18 ` [Qemu-devel] " Stefan Hajnoczi
@ 2010-07-02 13:32 ` Kevin Wolf
0 siblings, 0 replies; 5+ messages in thread
From: Kevin Wolf @ 2010-07-02 13:32 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel
Am 02.07.2010 15:18, schrieb Stefan Hajnoczi:
> On Fri, Jul 2, 2010 at 1:07 PM, Kevin Wolf <kwolf@redhat.com> wrote:
>> bdrv_aio_writev may call the callback immediately (and it will commonly do so
>> in error cases). Current code doesn't consider this. For details see the
>> comment added by this patch.
>>
>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>> ---
>> block.c | 35 +++++++++++++++++++++++++++++------
>> 1 files changed, 29 insertions(+), 6 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 9176dec..e65971c 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
>> // Check for mergable requests
>> num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb);
>>
>> - // Run the aio requests
>> + /*
>> + * Run the aio requests. As soon as one request can't be submitted
>> + * successfully, fail all requests that are not yet submitted (we must
>> + * return failure for all requests anyway)
>> + *
>> + * num_requests cannot be set to the right value immediately: If
>> + * bdrv_aio_writev fails for some request, num_requests would be too high
>> + * and therefore multiwrite_cb() would never recognize the multiwrite
>> + * request as completed. We also cannot use the loop variable i to set it
>> + * when the first request fails because the callback may already have been
>> + * called for previously submitted requests. Thus, num_requests must be
>> + * incremented for each request that is submitted.
>> + *
>> + * The problem that callbacks may be called early also means that we need
>> + * to take care that num_requests doesn't become 0 before all requests are
>> + * submitted - multiwrite_cb() would consider the multiwrite request
>> + * completed. A dummy request that is "completed" by a manual call to
>> + * multiwrite_cb() takes care of this.
>> + */
>> + mcb->num_requests = 1;
>> +
>> for (i = 0; i < num_reqs; i++) {
>> + mcb->num_requests++;
>> acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov,
>> reqs[i].nb_sectors, multiwrite_cb, mcb);
>>
>> @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs)
>> // We can only fail the whole thing if no request has been
>> // submitted yet. Otherwise we'll wait for the submitted AIOs to
>> // complete and report the error in the callback.
>> - if (mcb->num_requests == 0) {
>> - reqs[i].error = -EIO;
>> + if (i == 0) {
>> goto fail;
>> } else {
>> - mcb->num_requests++;
>> multiwrite_cb(mcb, -EIO);
>
> When bdrv_aio_writev() fails we don't know if the callback has been
> invoked by the block driver. Qcow2 will invoke the callback in some
> cases. This is a problem because num_requests will be decremented
> twice if we unconditionally call it here.
Talked to Stefan on IRC and we came to the conclusion that it's not a
problem in fact: qcow_aio_writev() either returns NULL or calls a
callback, but it never does both.
If a block driver returned NULL and called a callback for the same
request that would be a bug in the block driver.
Kevin
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-07-02 13:32 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-02 12:07 [Qemu-devel] [PATCH v2 0/2] block: Fix multiwrite error handling Kevin Wolf
2010-07-02 12:07 ` [Qemu-devel] [PATCH v2 1/2] block: Fix early failure in multiwrite Kevin Wolf
2010-07-02 13:18 ` [Qemu-devel] " Stefan Hajnoczi
2010-07-02 13:32 ` Kevin Wolf
2010-07-02 12:07 ` [Qemu-devel] [PATCH v2 2/2] block: Handle multiwrite errors only when all requests have completed Kevin Wolf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).