From: Paolo Bonzini <pbonzini@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>, Laszlo Ersek <lersek@redhat.com>
Cc: qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/3] qcow2: Check bs->drv in copy_sectors()
Date: Tue, 11 Mar 2014 15:04:31 +0100 [thread overview]
Message-ID: <531F17EF.9000500@redhat.com> (raw)
In-Reply-To: <20140311101604.GB3215@dhcp-200-207.str.redhat.com>
Il 11/03/2014 11:16, Kevin Wolf ha scritto:
> Am 11.03.2014 um 00:16 hat Laszlo Ersek geschrieben:
>> On 03/10/14 23:44, Max Reitz wrote:
>>> Before dereferencing bs->drv for a call to its member bdrv_co_readv(),
>>> copy_sectors() should check whether that pointer is indeed valid, since
>>> it may have been set to NULL by e.g. a concurrent write triggering the
>>> corruption prevention mechanism.
>>>
>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>> ---
>>> To be precise, this still is a race condition. If bs->drv is set to NULL
>>> after the check and before the call to bdrv_co_readv(), QEMU will
>>> obviously still crash. However, in order to circumvent this behavior, we
>>> would probably have to re-lock s->lock, check bs->drv, take the function
>>> pointer to bdrv_co_readv() and then unlock s->lock before the function
>>> is called. I found this rather ugly and therefore this still has a very
>>> small chance of running into a race condition.
>>> Therefore, I'm asking for your opinion on this, whether we can really
>>> take this chance or should rather "do it right". In fact, if I were a
>>> reviewer, I'd probably reject this patch and request the solution with
>>> the function pointer (if there is no better solution), but I was afraid
>>> to send such an ugly patch.
>
> No, the code is fine. Remember that qcow2 is not threaded, we're talking
> about coroutines here. There is no way for the code to yield between
> your check and the protected place.
>
>>> block/qcow2-cluster.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>>> index 36c1bed..9499df9 100644
>>> --- a/block/qcow2-cluster.c
>>> +++ b/block/qcow2-cluster.c
>>> @@ -380,6 +380,10 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
>>>
>>> BLKDBG_EVENT(bs->file, BLKDBG_COW_READ);
>>>
>>> + if (!bs->drv) {
>>> + return -ENOMEDIUM;
>>> + }
>>> +
>>> /* Call .bdrv_co_readv() directly instead of using the public block-layer
>>> * interface. This avoids double I/O throttling and request tracking,
>>> * which can lead to deadlock when block layer copy-on-read is enabled.
>>>
>>
>> I can't answer your question nor review this patch -- instead, I have a
>> question of my own: when you say "set to NULL by [...] the corruption
>> prevention mechanism", do you mean qcow2_pre_write_overlap_check():
>>
>> bs->drv = NULL; /* make BDS unusable */
>
> Yes, this is the place.
>
>> If so: I thought that it was quite a bold move, but also that we'd find
>> the SIGSEGVs sooner or later... :)
>
> In fact, if you use the block layer API, most functions check for
> bs->drv and return -ENOMEDIUM if it is NULL. The problem here is that we
> directly dereference the pointer without going through block.c (there's
> a good reason for this, see the comment, but it still makes it somewhat
> special).
But why not call qcow2_co_readv directly?
Paolo
next prev parent reply other threads:[~2014-03-11 14:04 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-03-10 22:44 [Qemu-devel] [PATCH 0/3] qcow2: Fix crash during copy_sectors() after corruption Max Reitz
2014-03-10 22:44 ` [Qemu-devel] [PATCH 1/3] qcow2: Check bs->drv in copy_sectors() Max Reitz
2014-03-10 23:16 ` Laszlo Ersek
2014-03-11 10:16 ` Kevin Wolf
2014-03-11 14:04 ` Paolo Bonzini [this message]
2014-03-11 18:19 ` Max Reitz
2014-03-11 18:09 ` Max Reitz
2014-03-10 22:44 ` [Qemu-devel] [PATCH 2/3] block: bs->drv may be NULL in bdrv_debug_resume() Max Reitz
2014-03-10 22:44 ` [Qemu-devel] [PATCH 3/3] iotests: Test corruption during COW request Max Reitz
2014-03-11 10:21 ` [Qemu-devel] [PATCH 0/3] qcow2: Fix crash during copy_sectors() after corruption Kevin Wolf
2014-03-11 13:13 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=531F17EF.9000500@redhat.com \
--to=pbonzini@redhat.com \
--cc=kwolf@redhat.com \
--cc=lersek@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).